(1.) Introduction.

In dealing with the problem of the relationship of attributes, not capable of
quantitative measurement, it has been usual to classify the two attributes into a
number of groups, $A_1, A_2, A_3, \ldots, A_s$, and $B_1, B_2, B_3, \ldots, B_t$. In this manner a table
has been formed containing $s$ columns and $t$ rows, or $s \times t$ compartments. The total
frequency of the population, or of the "universe" under consideration, to use the
logician's phrase, is then distributed into sub-groups corresponding to these $s \times t$
compartments. In simple cases of association, as in that of the presence of the
vaccination cicatrix and the recovery from an attack of smallpox, $s$ and $t$ are both
equal to two, and we have a simple four-fold division of the universe. In other cases
we have higher numbers, as when we classify the human eye into eight colour classes
and correlate these classes with six or more classes for hair colour. We may even
run up to as many as 18 to 25 classes for each attribute when we table the coat
colours of thoroughbred horses or pedigree dogs in the case of pairs of blood relatives.
Hitherto, in order to obtain a measure of the degree of correlation or association, we
have proceeded on the assumption that it was necessary to arrange the system of
classes like $A_1, A_2, \ldots, A_s$ in some order, which corresponded to a real quantitative
scale in the attribute, although we were unable to use this scale directly. Thus one
arranged eye-colours in what appeared to correspond to a scale of varying amounts
of orange pigment; the coat colours of horses were arranged in an order corresponding
fairly to what an artist would call their "value." I even analysed hair tints by
photographic processes. In all such cases the order seemed of vital importance.
Once this order was settled, the methods of my memoir* on the correlation of
characters not quantitatively measurable could be applied: the actual scale
corresponding to the classification could be deduced, and we were able, on the assumption
of normal frequency, to actually plot the regression lines for the correlation of a
variety of attributes.† The conception, however, of order in the classification was
at times very hampering. Take three broad classes like those for human temper:
quick tempered, good natured, and sullen; it is difficult to grasp the exact meaning
of a quantitative scale at the basis of this classification, and it is not obvious that the
right order is necessarily that with good-natured in the middle. Or, again, take the
case of human hair; omitting the brown reds, we can get a practically continuous series
of shades from jet black to flaxen, and from flaxen with increasing red up to the
deepest reds. Only the brown reds come in and upset the system! We seem,
therefore, forced to take a double scale, first one of black, and then one of red
pigment. Or, again, take the coat colour of greyhounds; these are classified into as
many as 40 fairly narrow groups, and we can arrange these groups in ascending
order of red, or black, or other pigmentation. We have more than one possible scale.

Now in recent work on such things as temper in man, eye colour in man, and hair
colour in man or other animals, I have proceeded to arrange my groups in two or
three different orders, and to calculate the correlation on the basis of these
different orders. The results for the different orders came out in rather striking
agreement, and the first sort of conclusion that one was tempted to draw was, for
example, that the inheritance of pigmentation was strikingly alike for all pigments.
But the agreement was in some cases far closer than one is accustomed to find when
one compares the inheritance of directly measurable characters, and I soon became
convinced that owing to some important theoretical law hitherto overlooked, the
order of the groups by which we classify our attributes is a matter of no importance
when we are determining correlation. The group order is all important for variation,
it has practically no influence on correlation. We may put sullen tempers where we
please in regard to quick and good-natured; we may place the shades of red hair at
either end of the hair scale or in the middle, and the inheritance coefficient will come
out nearly the same in value. Nay, we may go further, and classify finger prints
like Mr. Galton into "tents," "arches," "whorls," "croziers," &c., &c., and still be
able to find a numerical value of the degree of resemblance between two blood
relatives, although any arrangements of such groups into a possible quantitative
scale may be inconceivable. The object of this present paper is to deal with this
novel conception of what I have termed contingency, and to see its relation to our
older notions of association and normal correlation. The great value of the idea of
contingency for economic, social, and biometric statistics seems to me to lie in the
fact that it frees us from the need of determining scales before classifying our
attributes. I shall endeavour to illustrate the importance of this freedom in the
illustrations which follow the theoretical treatment of the subject.

* 'Phil. Trans.,' A, vol. 195, pp. 1-47.

† For example, for health and ability and for the correlation of the psychical and physical characters,
see the "Fourth Annual Huxley Lecture," 'Journal of the Anthropological Institute,' vol. 33, pp. 194-195.

(2.)  On  the  Conception  of  Contingency. 

In mathematical treatises on algebra a definition is usually given of independent
probability. If $p$ be the probability of any event, and $q$ the probability of a second
event, then the two events are said to be independent, if the probability of the
combined event be $p \times q$. Now let A be any attribute or character and let it be
classified into the groups $A_1, A_2, \ldots, A_s$, and let the total number of individuals
examined be $N$, and let the numbers which fall into these groups be $n_1, n_2, \ldots, n_s$
respectively. Then the probability of an individual falling into one or other of these
groups is given by $n_1/N, n_2/N, \ldots, n_s/N$ respectively. Now suppose the same
population to be classified by any other attribute into the groups $B_1, B_2, \ldots, B_t$, and
the group frequencies of the $N$ individuals to be $m_1, m_2, \ldots, m_t$ respectively. The
probability of an individual falling into these groups will be respectively $m_1/N$, $m_2/N$,
$m_3/N, \ldots, m_t/N$. Accordingly the number of combinations of $B_v$ with $A_u$ to be
expected on the theory of independent probability if $N$ pairs of attributes are
examined is

$$N \times \frac{n_u}{N} \times \frac{m_v}{N} = \frac{n_u m_v}{N} = \nu_{uv}.$$

Let the number actually observed be $n_{uv}$. Then, allowing for the errors of random
sampling,

$$n_{uv} - \frac{n_u m_v}{N} = n_{uv} - \nu_{uv}$$

is the deviation from independent probability in the occurrence of the groups $A_u$, $B_v$.
Clearly the total deviation of the whole classification system from independent
probability must be some function of the $n_{uv} - \nu_{uv}$ quantities for the whole table. I
term any measure of the total deviation of the classification from independent
probability a measure of its contingency. Clearly the greater the contingency, the
greater must be the amount of association or of correlation between the two
attributes, for such association or correlation is solely a measure from another
standpoint of the degree of deviation from independence of occurrence.
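
As a minimal sketch of the quantities just defined, the following Python computes the frequencies $\nu_{uv} = n_u m_v / N$ expected on independent probability and the deviations $n_{uv} - \nu_{uv}$. The function names and the small fourfold table are hypothetical, chosen only for illustration; they are not data from this memoir.

```python
def expected_frequencies(table):
    """Frequencies nu_uv = n_u * m_v / N expected on independent probability.

    `table` is a list of rows of observed frequencies n_uv.
    """
    row_totals = [sum(row) for row in table]        # n_u
    col_totals = [sum(col) for col in zip(*table)]  # m_v
    total = sum(row_totals)                         # N
    return [[n_u * m_v / total for m_v in col_totals] for n_u in row_totals]


def deviations(table):
    """Deviations n_uv - nu_uv from independent probability, cell by cell."""
    nu = expected_frequencies(table)
    return [[n - e for n, e in zip(obs_row, nu_row)]
            for obs_row, nu_row in zip(table, nu)]


# Hypothetical fourfold table (illustrative numbers only).
observed = [[30, 10],
            [20, 40]]
nu = expected_frequencies(observed)
dev = deviations(observed)
```

For any table the deviations sum to zero, because the observed and the expected frequencies both add up to $N$.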

Now it must be quite clear that if we make our measurement of contingency any
function whatever of such quantities as $n_{uv} - \nu_{uv}$ its magnitude will be absolutely
independent of the order of classification, i.e., its value will be unchanged if we
re-arrange the A's and the B's in any manner whatever. This is the fundamental
gain of this new conception of contingency. But precisely as we can measure
position or acceleration in a great variety of ways, so it is possible to measure
contingency. We must try to select out of these ways those which: (a) bring
contingency into line with the customary notions of correlation and association; and
(b) permit of not too laborious calculations leading to the required measure.
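
The independence of order can be illustrated with a short Python sketch. It assumes one arbitrary function of the deviations that treats every compartment alike (here the sum of their absolute values) and checks that re-arranging the classes leaves it unchanged. The table, the function names, and the choice of measure are all hypothetical.

```python
import random


def sub_contingencies(table):
    """Return the deviations n_uv - nu_uv, where nu_uv = n_u * m_v / N."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    return [[table[u][v] - rows[u] * cols[v] / total
             for v in range(len(cols))] for u in range(len(rows))]


def measure(table):
    """One arbitrary function of the deviations that treats all cells alike:
    the sum of their absolute values. Any other such function would do."""
    return sum(abs(d) for row in sub_contingencies(table) for d in row)


def permuted(table, row_order, col_order):
    """Re-arrange the classes of both attributes in the given orders."""
    return [[table[u][v] for v in col_order] for u in row_order]


# Hypothetical 3 x 3 table (illustrative numbers only).
table = [[12, 5, 3],
         [4, 20, 6],
         [2, 7, 16]]
rng = random.Random(0)
row_order = [0, 1, 2]
col_order = [0, 1, 2]
rng.shuffle(row_order)
rng.shuffle(col_order)
shuffled = permuted(table, row_order, col_order)
```

Comparing `measure(table)` with `measure(shuffled)` gives the same value up to rounding, whatever orders are drawn.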

We will consider these points at some length. I have shown in a paper,* "On
Deviations from the Probable in a Correlated System of Variables," that if $m'_1$,
$m'_2, \ldots, m'_n$ be any system of observed frequencies and $m_1, m_2, \ldots, m_n$ be any system
of theoretical frequencies known a priori, then if

$$\chi^2 = \text{sum}\left\{\frac{(m'_q - m_q)^2}{m_q}\right\} \text{ from } q = 1 \text{ to } n$$

be calculated, we can deduce a quantity $P$ from $\chi^2$ which is the probability that in
any trial a system $m''_1, m''_2, \ldots, m''_n$ of observed frequencies will occur, which
deviates more from $m_1, m_2, \ldots, m_n$, than the actually observed system does. Tables
have been worked out by Mr. Palin Elderton giving the value of $P$, for a
considerable range of values of $\chi^2$ and $n$, and have been published in 'Biometrika.'†

Now it will be obvious that if we want to measure contingency, we really want to
measure the deviation of the observed results from independent probability, and
therefore if we take $m_1, m_2, \ldots, m_n$ to correspond to the system $\nu_{uv}$ and $m'_1, m'_2, \ldots, m'_n$
to correspond to the actually observed system $n_{uv}$,

$$\chi^2 = S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} \qquad \text{(i.)},$$

will be a proper quantity to calculate, and $P$ would measure how far the observed
system is or is not compatible with a basis of independent probability. If $P$ be large
the chances are in favour of the system arising from independent probability; if $P$ be
small there is certainly association between the attributes. Hence $1 - P$ would be a
proper measure of the contingency. I propose to call $1 - P$ the contingency grade.
Further, it is convenient to have a name for a function closely related to $\chi^2$. I shall
call

$$\phi^2 = \chi^2 / N \qquad \text{(ii.)}$$

the mean square contingency.
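
Equations (i.) and (ii.) translate directly into a few lines of Python. This is a sketch only: the function names and the fourfold table are hypothetical, and the probability $P$ is not computed here because it requires the tabulated (or otherwise evaluated) goodness-of-fit integral.

```python
def square_contingency(table):
    """chi^2 = S{(n_uv - nu_uv)^2 / nu_uv}, equation (i.), over every cell.

    Assumes no row or column of `table` is empty, so that every nu_uv > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for u, n_u in enumerate(row_totals):
        for v, m_v in enumerate(col_totals):
            nu = n_u * m_v / total          # frequency on independent probability
            chi2 += (table[u][v] - nu) ** 2 / nu
    return chi2


def mean_square_contingency(table):
    """phi^2 = chi^2 / N, equation (ii.)."""
    return square_contingency(table) / sum(map(sum, table))


# Hypothetical fourfold table (illustrative numbers only).
observed = [[30, 10],
            [20, 40]]
chi2 = square_contingency(observed)
phi2 = mean_square_contingency(observed)
```

For this table the expected frequencies are 20, 20, 30, 30 and each deviation is 10 in magnitude, so $\chi^2 = 100/20 + 100/20 + 100/30 + 100/30 = 50/3$ and $\phi^2 = 1/6$.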


* 'Phil. Mag.,' July, 1900, pp. 157-175.
† Vol. I., p. 155.
It will be seen that, in the method by which we have approached the problem, we
have not had to consider the question of the sign of the contingency like $n_{uv} - \nu_{uv}$;
our mean square contingency is based on a summation of squares extending to all the
$s \times t$ compartments of the table. But if we treat now of quantities like $n_{uv} - \nu_{uv}$,
their total sum must be zero, since for the whole table

$$S(n_{uv}) = N = S(\nu_{uv}).$$

Let us suppose that the symbol $\Sigma$ refers to a summation of all the positive
contingencies, and let

$$\psi = \Sigma(n_{uv} - \nu_{uv})/N \qquad \text{(iii.)},$$

then $\psi$ shall be spoken of as the mean contingency. Clearly any functions of either
$\phi^2$ or $\psi$ would serve to measure the contingency. We shall be guided in our choice
of such functions by considering what are the values of $\phi^2$ and $\psi$ in the case of normal
correlation.
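
A minimal Python sketch of equation (iii.) follows, summing only the positive deviations $n_{uv} - \nu_{uv}$ and dividing by $N$. The function name and the table are hypothetical.

```python
def mean_contingency(table):
    """psi = (sum of the positive deviations n_uv - nu_uv) / N, equation (iii.)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    positive = 0.0
    for u, n_u in enumerate(row_totals):
        for v, m_v in enumerate(col_totals):
            d = table[u][v] - n_u * m_v / total
            if d > 0:                     # negative deviations are left out
                positive += d
    return positive / total


# Hypothetical fourfold table (illustrative numbers only).
observed = [[30, 10],
            [20, 40]]
psi = mean_contingency(observed)
```

Here the two positive deviations are each 10, so $\psi = 20/100 = 0.2$. Because all deviations sum to zero, summing the negative ones instead would give the same magnitude.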


(3.) On the Relation between Mean Square Contingency and Normal Correlation.

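Let $x$ and $y$ denote the deviations from their respective means of two characters or
attributes, of which $\sigma_x$, $\sigma_y$ are the standard deviations and $r$ is the correlation. Then
if we assume a normal distribution of frequency, $z_0\,\delta x\,\delta y$ would be the frequency of
individual pairs falling between $x$ and $x + \delta x$, $y$ and $y + \delta y$, where

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(iv.)},$$

on the assumption of independent probability, and $z\,\delta x\,\delta y$, where

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(v.)},$$

on the assumption of contingent probability.
We then have at once

$$\phi^2 = \frac{1}{N}\, S\left\{\frac{(z\,\delta x\,\delta y - z_0\,\delta x\,\delta y)^2}{z_0\,\delta x\,\delta y}\right\} = \frac{1}{N}\, S\left\{\frac{(z - z_0)^2}{z_0}\,\delta x\,\delta y\right\},$$

and we have only to insert the values of $z$ and $z_0$, given by (iv.) and (v.), and integrate
all over the plane of $x$, $y$, to find the mean square contingency.
Now, if $ac > b^2$, we know that

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-\frac{1}{2}(ax^2 - 2bxy + cy^2)}\, dx\, dy = \frac{1}{\sqrt{ac - b^2}} \qquad \text{(vi.)}.$$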
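This is all we need, for if $x = \sigma_x x'$, $y = \sigma_y y'$:

$$\phi^2 = \frac{1}{2\pi(1-r^2)}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{1+r^2}{1-r^2}x'^2 - \frac{4r}{1-r^2}x'y' + \frac{1+r^2}{1-r^2}y'^2\right)}\, dx'\, dy' - 1 = \frac{1}{1-r^2} - 1 \qquad \text{(vii.)}.$$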


Thus the mean square contingency is simply $r^2/(1 - r^2)$. Or,

$$r = \pm\sqrt{\frac{\phi^2}{1 + \phi^2}} \qquad \text{(viii.)}.$$

Thus  the  relationship  between  mean  square  contingency  and  correlation  in  the  case 
of  normal  frequency  is  of  an  extremely  simple  character. 
We  see  at  once  :  — 

(i.)  That  since  the  mean  square  contingency  is  absolutely  independent  of  the 
arrangement  of  our  classes,  the  coefficient  of  correlation  is  also  entirely 
independent  of  the  arrangement  of  our  classes  on  the  basis  of  any  assumed 
order  or  scale. 

(ii.)  Provided  our  classes  are  sufficiently  small  to  allow  of  us  legitimately 
replacing  by  groupings  over  small  areas  the  theoretical  integrations,  the 
coefficient  of  correlation  can  be  found  from  the  mean  square  contingency. 

We have thus an entirely new method of finding correlation in the case of
quantitatively non-measurable characters. It assumes, however, that our
classification-groups are sufficiently numerous and their contents sufficiently small to justify us in
supposing that the contingency has reached a definite limit. Clearly in working in
the future by the contingency method, we shall have to adopt rather more numerous
classes, and they should not contain too irregular proportions of individuals, but we
can then afford to drop any question of scale or order of grouping.
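
The passage from mean square contingency to correlation can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original investigation: it approximates a normal correlation surface with unit standard deviations by the density at the centre of small square cells (a hypothetical grid of width 0.1 out to 6 standard deviations), forms $\phi^2$ from the resulting table of proportions, and recovers $r$ from equation (viii.).

```python
import math


def normal_table(r, half_width=6.0, step=0.1):
    """Approximate cell proportions of a normal correlation surface with unit
    standard deviations and correlation r, using the density at the centre of
    each small square cell and normalising the total to 1."""
    n = int(round(2 * half_width / step))
    mids = [-half_width + (i + 0.5) * step for i in range(n)]
    q = 1.0 - r * r
    cells = [[math.exp(-(x * x - 2 * r * x * y + y * y) / (2 * q)) for y in mids]
             for x in mids]
    total = sum(map(sum, cells))
    return [[c / total for c in row] for row in cells]


def mean_square_contingency(p):
    """phi^2 = S{(p_uv - p_u p_v)^2 / (p_u p_v)} for a table of proportions."""
    rows = [sum(row) for row in p]
    cols = [sum(col) for col in zip(*p)]
    return sum((p[u][v] - rows[u] * cols[v]) ** 2 / (rows[u] * cols[v])
               for u in range(len(rows)) for v in range(len(cols)))


r = 0.5
phi2 = mean_square_contingency(normal_table(r))
r_back = math.sqrt(phi2 / (1.0 + phi2))   # equation (viii.), positive root
```

With such fine grouping `phi2` comes out close to $r^2/(1-r^2) = 1/3$ and `r_back` close to 0.5; coarser grouping would be expected to pull both values down.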

It may be asked whether this method of deriving the correlation from the
contingency cannot replace the earlier method of deducing the correlation by the
fourfold division of the material. The answer is that in some cases it can do so very
advantageously, but it is very far from doing so in all. The contingency found from
a fourfold table is a perfectly real and very proper measure of the deviation of its
material from independent probability. But if this mean square contingency be
substituted in equation (viii.), it will not give us the correlation. The proper mean
square contingency to give us the correlation must be based on a sufficiently large
number of classes. When, however, we take, say, 20 classes for each attribute, we
have 400 terms to deal with in calculating $\phi^2$, and although the result might then
possibly give a more accurate value for the correlation than that found from a fourfold
division, yet the labour of determining it is far greater and may be excessive.
Further, the simple classification into two or three groups may be all we are able to
make at all, or all we can conveniently make. Hence the new conception of
contingency, while illuminating the whole subject, especially as demonstrating that
the correlation is independent of scale or grouping, does not do away with the older
method of the fourfold division. I propose to call the expression


$$\sqrt{\frac{\phi^2}{1 + \phi^2}}$$


the  first  coefficient  of  contingency. 

We note that with small enough classes the coefficient of contingency becomes the
coefficient of correlation. Accordingly, with a view of lessening the number of
coefficients in use, I adopt the following convention: Any expression or function of
either the mean square contingency ($\phi^2$) or the mean contingency ($\psi$) (or indeed of
any other measure of the contingency), which, when the grouping is sufficiently small,
is theoretically equal to the coefficient of correlation, on the hypothesis of normal
frequency, shall be termed a coefficient of contingency. All such coefficients of
contingency must, on the same hypothesis, become equal on a sufficiently small
grouping, and they will scarcely differ widely from each other when the frequency is
not absolutely normal and the grouping is merely moderately small. These points
will be illustrated later.
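
That a fourfold table does not return the correlation through $\sqrt{\phi^2/(1+\phi^2)}$ can be seen in a small Python sketch. It assumes a normal correlation surface divided at the two means, for which the proportion in the quadrant above both means is the known value $\tfrac{1}{4} + \sin^{-1} r / 2\pi$; the function names are hypothetical.

```python
import math


def first_coefficient(phi2):
    """The first coefficient of contingency, sqrt(phi^2 / (1 + phi^2))."""
    return math.sqrt(phi2 / (1.0 + phi2))


def fourfold_at_means(r):
    """Proportions in the fourfold table got by dividing both characters of a
    normal correlation surface at their means (quadrant result:
    P(both above the mean) = 1/4 + arcsin(r) / (2 * pi))."""
    same = 0.25 + math.asin(r) / (2.0 * math.pi)
    diff = 0.5 - same
    return [[same, diff], [diff, same]]


def mean_square_contingency(p):
    """phi^2 for a table of proportions p_uv."""
    rows = [sum(row) for row in p]
    cols = [sum(col) for col in zip(*p)]
    return sum((p[u][v] - rows[u] * cols[v]) ** 2 / (rows[u] * cols[v])
               for u in range(len(rows)) for v in range(len(cols)))


r = 0.5
c1_fourfold = first_coefficient(mean_square_contingency(fourfold_at_means(r)))
```

For $r = 0.5$ the four proportions are 1/3, 1/6, 1/6, 1/3, so $\phi^2 = 1/9$ and the coefficient is $\sqrt{0.1}$, about 0.32, well short of the correlation 0.5. This is the sense in which the grouping must be sufficiently fine.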

(4.)  On  the  Relation  of  Mean  Contingency  to  Normal  Correlation. 

A great deal of the labour of finding either the coefficient of contingency or the
coefficient of correlation by the method of mean square contingency when the groups
are numerous, depends upon the squaring of the contingencies and dividing by the
frequency to be expected on the basis of independent probabilities. The whole of
this labour is escaped, if we work with the mean contingency instead of the mean
square contingency; further, since in this case we only sum for the positive
contingencies, neglecting the negative, we have usually to deal with only, or often less
than, a moiety of the terms involved in calculating $\phi^2$. On the other hand, there is no
simple relation between the correlation and the mean contingency such as we have
found between correlation and mean square contingency in equation (viii.) above.
The relation is far more complex and is only expressible in the form of integrals
reducible by quadratures. Still for practical purposes we rarely want the coefficient
of contingency to more than two decimal places. Hence, if the integral be evaluated
for the coefficient proceeding by equal intervals, we can plot a curve giving the value
of the coefficient of contingency in terms of the mean contingency, and this will be
sufficiently accurate to enable us to read off the former in terms of the latter to the
required degree of accuracy. The enquiry also brings out some other points not
without interest.*

To investigate the curve which in a normal correlation surface separates on the
plane of xy areas of positive from areas of negative contingency.

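The frequency due to independent probability will be equal to that due to the
actual contingent probability when

$$\frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)},$$

where $r$ is the coefficient of correlation, or of contingency.
Clearly

$$(1-r^2)\log_e(1-r^2) = -r^2\left\{\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right\} \qquad \text{(ix.)}.$$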

Since $r$ is always less than unity, this curve is clearly a hyperbola, which possesses
several interesting properties. We see at once that all the contingency of one sense
is grouped into the space between the two branches of this hyperbola, and that the
contingency of the other sense is grouped into the two separate spaces inside the two
branches. Thus contingency of either sense is for normal correlation continuous, and
abrupt changes of sign in the contingency, beyond the limits of random sampling,
are not to be expected.

By testing on actual correlation tables I find this hyperbola comes out in a fairly
marked manner, in fact, quite as significantly as the elliptic contours of equal
frequency.

I propose to consider the properties of this zero contingency hyperbola; it forms
the curve along which two really contingent events have a frequency identical with
their independent probability.
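
The defining property of the zero contingency hyperbola can be verified numerically. The Python sketch below assumes unit standard deviations and $N = 1$, writes the hyperbola as $x^2 - 2xy/r + y^2 = \beta_0$ with $\beta_0 = -(1-r^2)\log_e(1-r^2)/r^2$, and compares the two frequency surfaces at points on it. The function names are hypothetical.

```python
import math


def z_independent(x, y):
    """Ordinate on independent probability (unit standard deviations, N = 1)."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)


def z_correlated(x, y, r):
    """Ordinate of the normal correlation surface (unit s.d., N = 1)."""
    q = 1.0 - r * r
    return (math.exp(-(x * x - 2 * r * x * y + y * y) / (2.0 * q))
            / (2.0 * math.pi * math.sqrt(q)))


def beta_zero(r):
    """beta_0 = -(1 - r^2) log_e(1 - r^2) / r^2, the constant of the
    zero contingency hyperbola; requires 0 < r < 1."""
    return -(1.0 - r * r) * math.log(1.0 - r * r) / (r * r)


def point_on_hyperbola(y, r):
    """Larger root x of x^2 - 2xy/r + y^2 = beta_0 for a given y."""
    disc = (y / r) ** 2 - (y * y - beta_zero(r))   # always positive for 0 < r < 1
    return y / r + math.sqrt(disc)


r = 0.5
pairs = [(point_on_hyperbola(y, r), y) for y in (-2.0, 0.0, 1.5)]
ratios = [z_correlated(x, y, r) / z_independent(x, y) for x, y in pairs]
```

Each entry of `ratios` equals 1 up to rounding, while at the origin, which lies between the branches, the correlated ordinate exceeds the independent one.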

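Consider the two families of curves:

$$\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \alpha \qquad \text{(x.)},$$

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \beta \qquad \text{(xi.)}.$$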

* I have to heartily thank my assistant, Dr. L. N. G. Filon, for the substance of the first part of the
investigation given below, down to equation (xiii.). I owe the calculation and plotting of the curves
$u = e^{-k\sec\theta}$ to my assistant, Mr. J. C. M. Garnett.
Since $r$ is always $< 1$, the $\alpha$ family form a set of concentric, similar, and similarly-
situated ellipses, and the $\beta$ family a set of concentric, similar, and similarly-situated
hyperbolas. Any conic having double contact with the hyperbola $\beta_0$, of zero
contingency defined by (ix.), at the ends of a diameter $y = mx$, has for its equation

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} - \beta_0 + \lambda(y - mx)^2 = 0.$$

If this be identical with an ellipse $\alpha$, we have, by comparing coefficients and
eliminating $\lambda$ and $m$,

$$\alpha^2 = r^2\beta_0^2.$$

Consequently $\alpha = \pm r\beta_0$, the sign being determined from the fact that $\alpha$ must
always be positive for real ellipses.

Now the ordinate $z$ of the normal frequency surface is given by

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)},$$

and to find the mean contingency we must determine the whole volume lying inside
the two branches of the above hyperbola, integrating on both sides of the line of
contact of the families of hyperbolas and ellipses.*

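We have

$$N I_r = \iint z\, dx\, dy = \iint \frac{z}{J}\, d\alpha\, d\beta$$

over this area, where

$$J = \frac{\partial(\alpha, \beta)}{\partial(x, y)} = \frac{4(1-r^2)}{r\sigma_x\sigma_y}\left(\frac{x^2}{\sigma_x^2} - \frac{y^2}{\sigma_y^2}\right)$$

from (x.) and (xi.).

But from (x.) and (xi.)

$$\left\{\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)^2 - \frac{4x^2y^2}{\sigma_x^2\sigma_y^2}\right\}(1-r^2)^2 = (\alpha^2 - r^2\beta^2)(1-r^2).$$

Or, choosing the signs to make $J$ positive, we have

$$J = \frac{4\sqrt{1-r^2}}{r\sigma_x\sigma_y}\sqrt{\alpha^2 - r^2\beta^2}.$$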


* The ellipses and hyperbolas have common pairs of conjugate diameters; one line of contact is one
of the asymptotes of the hyperbola $x^2/\sigma_x^2 - y^2/\sigma_y^2 = 1$; and tangents at an intersection point of any of the
family of ellipses with any of the family of hyperbolas are respectively parallel to conjugate diameters of
this hyperbola. These geometrical properties, however, need not detain us here.
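Thus the required integral is

$$I_r = \frac{r}{2\pi(1-r^2)}\int_{r\beta_0}^{\infty}\!\int_{\beta_0}^{\alpha/r} \frac{e^{-\frac{\alpha}{2(1-r^2)}}}{\sqrt{\alpha^2 - r^2\beta^2}}\, d\beta\, d\alpha = \frac{1}{2\pi(1-r^2)}\int_{r\beta_0}^{\infty} \cos^{-1}\!\left(\frac{r\beta_0}{\alpha}\right) e^{-\frac{\alpha}{2(1-r^2)}}\, d\alpha.$$

To simplify put, using (ix.),

$$\alpha = \beta_0 r\sec\theta, \qquad k = \frac{r\beta_0}{2(1-r^2)} = -\frac{1}{2r}\log_e(1-r^2) \qquad \text{(xii.)},$$

where $k$ will always be positive, since $r < 1$.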

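We have

$$I_r = \frac{k}{\pi}\int_0^{\pi/2} \theta\, e^{-k\sec\theta}\sec\theta\tan\theta\, d\theta,$$

or, integrating by parts,

$$I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}\, d\theta \qquad \text{(xiii.)}.$$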

The curves $u = e^{-k\sec\theta}$ were then plotted with our coordinatograph for a series of
values of $k$ or $r$ on a large scale, drawn in with a spline and integrated with a Coradi
compensating planimeter. The values of $I_r$ resulting are tabled on p. 15 (Table I.).
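
Where no planimeter is at hand, the integral can be evaluated by ordinary numerical quadrature. The Python sketch below assumes the form $I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}\,d\theta$ with $k = -\log_e(1-r^2)/2r$ and applies the midpoint rule; the function names and the number of steps are hypothetical choices.

```python
import math


def k_of_r(r):
    """k = -log_e(1 - r^2) / (2 r); requires 0 < r < 1."""
    return -math.log(1.0 - r * r) / (2.0 * r)


def integral_I(r, steps=20000):
    """I_r = (1/pi) * integral from 0 to pi/2 of exp(-k sec(theta)) d(theta),
    by the midpoint rule, which never evaluates sec(theta) at pi/2 itself."""
    k = k_of_r(r)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.exp(-k / math.cos(theta))   # the curve u = exp(-k sec theta)
    return total * h / math.pi


i_05 = integral_I(0.5)
i_005 = integral_I(0.05)
```

The values obtained in this way should agree with the tabled values of $I_r$ to about two or three decimal places; small differences in the third place are to be expected from the graphical method.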

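We have next to investigate what is the volume $NQ_r$ of the surface of independent
probability

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)}$$

which falls within the same hyperbola of contingency. We shall then have in $Q_r - I_r$
the required value of $\psi$, the mean contingency on the basis of normal correlation. We
have

$$Q_r = \frac{1}{2\pi\sigma_x\sigma_y}\iint e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)}\, dx\, dy$$

taken over the space inside the two branches of the hyperbola

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \beta_0.$$

Write $x = x'\sigma_x$, $y = y'\sigma_y$, and we have

$$x'^2 - \frac{2x'y'}{r} + y'^2 = \beta_0.$$

Transform to polars, $\rho\cos\theta = x'$, $\rho\sin\theta = y'$,

$$\rho^2\left(1 - \frac{\sin 2\theta}{r}\right) = \beta_0 \qquad \text{(xiv.)}.$$

This shows us that the axes are given by $\theta = \frac{\pi}{4}$ and $\frac{\pi}{4} + \frac{\pi}{2}$, or are $a$ and $b$, where

$$a^2 = r\beta_0/(1 + r), \qquad b^2 = r\beta_0/(1 - r).$$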
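Take these axes as axes of coordinates. Then we have to integrate

$$Q_r = \frac{1}{\pi}\iint e^{-\frac{1}{2}(x^2 + y^2)}\, dx\, dy,$$

over the area inside one branch of the hyperbola

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

Let

$$x^2 + y^2 = \alpha, \qquad \frac{1+r}{r}x^2 - \frac{1-r}{r}y^2 = \beta \qquad \text{(xv.)},$$

and let us transfer the integrations to $\alpha$ and $\beta$.
We have

$$x^2 = \tfrac{1}{2}\{\alpha - r(\alpha - \beta)\}, \qquad y^2 = \tfrac{1}{2}\{\alpha + r(\alpha - \beta)\},$$

and

$$Q_r = \frac{2}{\pi}\iint e^{-\frac{1}{2}\alpha}\, dx\, dy$$

over one-half one branch of the hyperbola.

$$J = \frac{d\alpha\, d\beta}{dx\, dy} = \frac{\partial(\alpha, \beta)}{\partial(x, y)} = \frac{8xy}{r} = \frac{4}{r}\{\alpha^2 - r^2(\alpha - \beta)^2\}^{\frac{1}{2}}.$$

Thus we have

$$Q_r = \frac{r}{2\pi}\int_{r\beta_0/(1+r)}^{\infty}\!\int_{\beta_0}^{(1+r)\alpha/r} \frac{e^{-\frac{1}{2}\alpha}\, d\beta\, d\alpha}{\sqrt{\alpha^2 - r^2(\alpha - \beta)^2}} \qquad \text{(xvi.)}.$$

The limits are obtained from the consideration, easily seen on a figure, that for a
given $\alpha$ we must integrate from $\beta = \beta_0$, the given hyperbola, to $\beta = \frac{1+r}{r}\alpha$, the
touching hyperbola; and then for $\alpha$ we must take every circle from that touching $\beta_0$,
i.e., $\alpha = r\beta_0/(1 + r)$ up to infinity.

We will first integrate with regard to $\beta$, and put

$$r(\alpha - \beta) = -\alpha\sin\phi.$$

This gives, when $\beta = (1 + r)\alpha/r$, $\phi = \frac{1}{2}\pi$; and when

$$\beta = \beta_0, \qquad \phi = \sin^{-1}\frac{r(\beta_0 - \alpha)}{\alpha}.$$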
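Thus we find

$$Q_r = \frac{1}{2\pi}\int_{r\beta_0/(1+r)}^{\infty} \cos^{-1}\!\left\{\frac{r(\beta_0 - \alpha)}{\alpha}\right\} e^{-\frac{1}{2}\alpha}\, d\alpha.$$

Take

$$\cos\chi = r(\beta_0 - \alpha)/\alpha,$$

then

$$\alpha = \infty, \quad \cos\chi = -r; \qquad \alpha = \frac{r\beta_0}{1+r}, \quad \cos\chi = 1.$$

Hence

$$Q_r = \frac{r\beta_0}{2\pi}\int_0^{\cos^{-1}(-r)} \frac{\chi\sin\chi}{(r + \cos\chi)^2}\, e^{-\frac{r\beta_0}{2(r + \cos\chi)}}\, d\chi = \frac{1}{\pi}\int_0^{\cos^{-1}(-r)} e^{-\frac{r\beta_0}{2(r + \cos\chi)}}\, d\chi,$$

observing that the term between the limits vanishes at both.
Take

$$\cos\theta = (r + \cos\chi)/(r + 1).$$

Then

$$\chi = 0, \quad \theta = 0; \qquad \chi = \cos^{-1}(-r), \quad \theta = \tfrac{1}{2}\pi.$$

Thus we find finally, after some reductions,

$$Q_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-\kappa\sec\theta}\sqrt{\frac{1 + \cos\theta}{\epsilon + \cos\theta}}\, d\theta \qquad \text{(xvii.)},$$

where

$$\epsilon = (1 - r)/(1 + r) \qquad \text{(xviii.)},$$

$$\kappa = \frac{r\beta_0}{2(1+r)} = -\frac{1-r}{2r}\log_e(1 - r^2) \qquad \text{(xix.)}$$

$= (1 - r)k$, of the integral $I_r$.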
Tables were now formed of $\epsilon$ and $\kappa$ and the ordinates of the curves

$$v = e^{-\kappa\sec\theta}\sqrt{\frac{1 + \cos\theta}{\epsilon + \cos\theta}} \qquad \text{(xx.)}$$

calculated.* These ordinates were plotted on a large scale by aid of a Coradi
coordinatograph and the resulting curves integrated as before; the values of $Q_r$ thus
found are given with the values of $I_r$ and $\psi$ in the table below. I believe this table
gives the mean contingency in terms of the correlation true to at least three places of
decimals. The $u$ and $v$ curves are both interesting analytically and subject to rather
curious changes of type. We were aided in plotting them by calculating, where
needful, $du/d\theta$ and $dv/d\theta$. Finally, the values of $r$ were plotted by my demonstrator,
Mr. L. W. Atcherley, to the corresponding values of $\psi$. Thus a curve was obtained,
which enables us to read off the correlation from the contingency correct to at least
two places of decimals, sufficient for nearly all practical purposes.

* I owe the calculation of these ordinates to Dr. Alice Lee.

Table I. Table of Integrals $I_r$, $Q_r$, and the Contingency $\psi$ for Values of $r$.

   r        I_r        Q_r        ψ
  0.00     .5000      .5000      .0000
   .05     .4620      .4762      .0142
   .10     .4342      .4652      .0310
   .20     .3895      .4536      .0641
   .30     .3501      .4498      .0996
   .40     .3162      .4547      .1385
   .50     .2830      .4643      .1813
   .60     .2489      .4814      .2325
   .70     .2128      .5106      .2978
   .80     .1700      .5524      .3824
   .90     .1180      .6279      .5093
   .95     .0796      .7009      .6213
  1.00     .0000     1.0000     1.0000

Diagram I. at the end of this memoir will therefore serve for most purposes of
interpolation, and it will be seen that now that the integrals have been evaluated and
the diagram constructed, the correlation can be very easily found from mean
contingency. But the method seems to me distinctly inferior to that of mean square
contingency, and this for much the same reasons that mean error calculations are
inferior to mean square error work in curve fitting. Further, the grade of contingency
can be found at once from a knowledge of mean square contingency, and whatever be
the distribution is a significant and interpretable constant. This is only true of the
correlation deduced from mean contingency if the distribution be normal.
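
In place of the diagram, the correlation may be read back from the mean contingency by numerical inversion. The Python sketch below is an illustration under stated assumptions: it takes $I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-k\sec\theta}\,d\theta$ and $Q_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-\kappa\sec\theta}\sqrt{(1+\cos\theta)/(\epsilon+\cos\theta)}\,d\theta$, with $k = -\log_e(1-r^2)/2r$, $\kappa = (1-r)k$ and $\epsilon = (1-r)/(1+r)$, evaluates both by the midpoint rule, and inverts $\psi = Q_r - I_r$ by bisection on the further assumption (suggested by Table I.) that $\psi$ increases steadily with $r$. All function names and step counts are hypothetical.

```python
import math


def _midpoint(f, a, b, steps=4000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def integral_I(r):
    """I_r, with k = -log_e(1 - r^2) / (2 r); requires 0 < r < 1."""
    k = -math.log1p(-r * r) / (2.0 * r)
    return _midpoint(lambda t: math.exp(-k / math.cos(t)), 0.0, math.pi / 2) / math.pi


def integral_Q(r):
    """Q_r, with eps = (1 - r)/(1 + r) and kappa = (1 - r) k."""
    eps = (1.0 - r) / (1.0 + r)
    kappa = -(1.0 - r) * math.log1p(-r * r) / (2.0 * r)

    def v(t):
        c = math.cos(t)
        return math.exp(-kappa / c) * math.sqrt((1.0 + c) / (eps + c))

    return _midpoint(v, 0.0, math.pi / 2) / math.pi


def mean_contingency(r):
    """psi = Q_r - I_r for normal correlation r."""
    return integral_Q(r) - integral_I(r)


def correlation_from_mean_contingency(psi, lo=1e-6, hi=1.0 - 1e-6, iters=30):
    """Invert psi(r) by bisection, assuming psi increases with r."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_contingency(mid) < psi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


psi_05 = mean_contingency(0.5)
r_back = correlation_from_mean_contingency(psi_05)
```

The computed values should agree with Table I. to roughly two decimal places, which is the accuracy claimed for reading the correlation from the diagram; the bisection then returns the value of $r$ from which $\psi$ was computed.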

(5.) To sum up our results so far:

We have, if

$n_{uv}$ be the actual frequency of a group in the population $N$, which combines the
characters $A_u$ and $B_v$, and $\nu_{uv}$ be the frequency of this group on the hypothesis of
independent probability, then

$n_{uv} - \nu_{uv}$ is simply a sub-contingency,

$$S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} = \chi^2 \text{ may be termed the square contingency,}$$

$$\frac{1}{N}\, S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} = \phi^2 \text{ is the mean square contingency,}$$

$$\frac{1}{N}\,\Sigma(n_{uv} - \nu_{uv}) = \psi,$$

where $\Sigma$ is the sum for positive (or negative) sub-contingencies only, is the mean contingency.
Any one of these expressions is a measure of the deviation of the system from
independent probability, and therefore of the amount of association or correlation
between the characters or attributes involved. But any function of these expressions
is also a proper measure. Such functions are:

(a.) The contingency grade. This is $1 - P$, where $P$ is to be found from $\chi^2$ by aid
of the tables for "goodness of fit." See 'Biometrika,' vol. 1, pp. 155, et seq.

(b.) The mean square contingency coefficient $= C_1$, where

$$C_1 = \sqrt{\frac{\phi^2}{1 + \phi^2}} \qquad \text{(xxi.)}.$$

(c.) The mean contingency coefficient $= C_2$, where $C_2$ is to be found from the table
on p. 15 or from Diagram I. at the end of this memoir.

In the case of sufficiently small grouping and normal correlation we have

$$C_1 = C_2 = \text{coefficient of correlation}.$$

But it must not be forgotten that this is essentially a limiting, not a general case.
Nevertheless the approach to equality of the two contingency coefficients will be a
good measure of the normality of the distribution and the suitability as to smallness
of our elements of grouping.

(6.) A little experience of actual working, however, shows that in practice it is
perfectly easy to overshoot the mark in fineness of grouping. Suppose that in
dealing with 1000 cattle we find a single instance of a calf inscribed as "mulberry,"
say the offspring of a red cow by a dark fawn bull. Now if there be 30 dark fawn
bulls, the independent probability of a dark fawn bull having a mulberry offspring
is .03. Hence the sub-contingency for a ♂ parent-offspring table $= 1 - .03 = .97$,
and the corresponding contribution to the square contingency will be $(.97)^2/.03$, or
is upwards of 31. The fact is, that when we come to very fine groupings we get at
once into difficulties owing to our having to record by units only. Suppose
"mulberry" calves actually had no relation to any special parentage, but were rare
anomalies occurring once among 1000 calves, or perhaps were merely an odd breeder's
fancy description; then a unit cannot be divided in the proportions of the colour
parentage, it must fall into some one colour parentage group. The result is
that a few isolated individuals will give large contributions to the mean square
contingency. The above example is purely hypothetical, but similar cases have
actually occurred in dealing with colour problems by the contingency method. They
are exactly similar to those which occur when dealing with outlying individuals by
the test for "goodness of fit." In a frequency distribution we proceed only by units,
but the theory gives fractional values of the frequency; hence in forming the value of
$\chi^2$ to measure goodness of fit, one or two unit "outliers," although not improbable as
far as the whole of the tail of a curve is concerned, may be exceedingly improbable if
considered from the standpoint of the actual group in which they do occur. This
point must be carefully borne in mind in actual practice, for by sufficient refinement
of grouping, i.e., till we reduce certain groups to a single individual or two, the mean
square contingency can be increased in a remarkable manner.
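
The arithmetic of the hypothetical "mulberry" calf can be retraced as follows. The helper name `cell_contribution` is an assumption made for illustration; the numbers (1 calf, 30 dark fawn bulls, 1000 cattle) are those of the example above.

```python
def cell_contribution(observed, row_total, col_total, n_total):
    """Return (nu, sub, contribution) for a single cell of a contingency table."""
    nu = row_total * col_total / n_total  # independent-probability frequency
    sub = observed - nu                   # sub-contingency
    return nu, sub, sub * sub / nu        # share of the square contingency

# One "mulberry" calf in all, 30 dark fawn bulls, 1000 cattle.
nu, sub, contribution = cell_contribution(
    observed=1, row_total=1, col_total=30, n_total=1000
)
print(nu, sub, contribution)
```

A single recorded unit thus adds rather more than 31 to $\chi^2$, although it is only one individual in a thousand.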

(7.) Of course this is merely saying that the probable errors of the sub-contingencies
increase largely when we make $\nu_{uv}$ very small. Unfortunately I have
not yet succeeded in determining the probable errors of the contingency coefficients.
If $c_{uv}$ be the contingency, determined by

$c_{uv} = n_{uv} - n_u n_v/N$,

and $\Sigma_{c_{uv}}$ its standard deviation for random sampling, I find

$\Sigma_{c_{uv}}^2 = n_{uv}(1 - n_{uv}/N) + (n_u n_v/N^2)(n_u + n_v - 4n_u n_v/N) - 2(n_{uv}/N)(n_u + n_v - 3n_u n_v/N)$ . . . . (xxii.),

so that the probable error of any individual contingency $= .67449\,\Sigma_{c_{uv}}$ is determined.
Further, if $R_{c_{uv}c_{u'v'}}$ be the correlation between errors due to random sampling in two
contingencies $c_{uv}$ and $c_{u'v'}$, not belonging to either the same row or column,

[expression illegible in this copy] . . . . (xxiii.).

Similarly we find for the correlation of errors of two contingencies of the same
column, $R_{c_{uv}c_{uv'}}$, the result

[expression illegible in this copy] . . . . (xxiv.),

and for errors of two contingencies of the same row,

[expression illegible in this copy] . . . . (xxv.).

Results (xxii.) to (xxv.) enable us to find the probable errors and the error
correlations for any individual contingencies which will arise from random sampling,
and are so far of value; but when we attempt to find the general expression for the
probable error of either the mean or mean square contingency, it becomes so complex
that there appears little hope of deducing a simple result. Arithmetically the
problem might be solved at the expense of rather troublesome numerical calculations
if the number of sub-groups was not very large. A general and simple expression for
the probable error of $\psi$ or $\phi^2$ involving $\psi$ or $\phi^2$ only does not appear likely to exist,
and an expression involving all the sub-group frequencies would be very troublesome
for computation. Practically the errors of the contingency coefficients may be fairly
reasonably taken to lie between the probable errors of $r$ as found by a fourfold
division of a table and by the product method, approaching the latter more closely as
the number of sub-groups is sufficiently increased. With the experience of probable
errors of fourfold tables before us we may, I think, safely take the probable error of a
contingency coefficient $C$ for rough judgments to be less than

$2 \times .67449\,(1 - C^2)/\sqrt{N}$,

i.e., double the probable error of a correlation coefficient found from the product
moment. At the same time we must distinctly be cautious, remembering the difficulty
as to isolated units referred to in the previous section.

We may look at the probable error of the contingency from another standpoint.

Taking the mean squared contingency, we have

$\phi^2 = r^2/(1 - r^2)$.

Therefore

$\delta\phi^2 = 2r\,\delta r/(1 - r^2)^2$,

and accordingly, if $\sigma_{\phi^2}$, $\sigma_r$ be the standard deviations in errors of $\phi^2$ and $r$,

$\sigma_{\phi^2} = \{2r/(1 - r^2)^2\}\,\sigma_r = \{2r/(1 - r^2)^2\}\,(1 - r^2)/\sqrt{N}$.*

Hence if we were to determine $\phi^2$ from $r$, the probable error of $\phi^2$ would be
given by

Probable error of $\phi^2 = .67449\,(2\phi/\sqrt{N})\,\sqrt{1 + \phi^2}$.

Or, we can put it into the more useful form,

Percentage probable error of $\phi^2 = (134.898/\sqrt{N})\,\sqrt{(1 + \phi^2)/\phi^2}$ . . . . (xxvi.).

Thus the percentage probable error increases rapidly as the contingency gets smaller.

* 'Phil. Trans.,' A, vol. 191, p. 242.

Of course, the probable error of $\phi^2$ as found from $r$ is not necessarily the same as
the probable error of $\phi^2$ found directly, but it may serve as a guide to its approximate
value.

If it were the same, the probable error of $r$ as found from $\phi^2$ would be
$.67449/\{(1 + \phi^2)\sqrt{N}\}$, a result, as indicated in the previous paragraph, much too
small, except possibly for very successful systems of grouping.
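
The rough rules of this section are easily put into code. The sketch below is illustrative only: the function names are hypothetical, and the sample values ($N = 1000$, $C = .5$, and two values of $\phi^2$) are made up rather than taken from the memoir.

```python
from math import sqrt

PE_FACTOR = 0.67449  # converts a standard deviation into a probable error

def pe_bound_contingency_coefficient(c, n):
    """Rough upper limit suggested above: twice the product-moment probable error."""
    return 2.0 * PE_FACTOR * (1.0 - c * c) / sqrt(n)

def pe_phi2_from_r(phi2, n):
    """Probable error of phi^2 when phi^2 is derived from r (normal correlation)."""
    return PE_FACTOR * 2.0 * sqrt(phi2) * sqrt(1.0 + phi2) / sqrt(n)

def percentage_pe_phi2(phi2, n):
    """The same quantity expressed as a percentage of phi^2, as in (xxvi.)."""
    return 134.898 / sqrt(n) * sqrt((1.0 + phi2) / phi2)

bound = pe_bound_contingency_coefficient(c=0.5, n=1000)
small = percentage_pe_phi2(phi2=0.05, n=1000)
large = percentage_pe_phi2(phi2=0.50, n=1000)
print(bound, small, large)
```

Comparing `small` with `large` shows the point made in the text: for the same $N$ the percentage probable error is much greater when the contingency is slight.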

(8.) To find under what other condition than normal correlation small changes in
the order of grouping will not affect the value of the correlation.

Let us assume the unit of grouping to be very small, but not necessarily the same
for all groups. Let the two characters or attributes be $x$ and $y$, and suppose $n_s$ to
be the total frequency of individuals in the range $y_s - \epsilon$ to $y_s + \epsilon$, and $n_{s+1}$ to be the
total frequency in the range $y_{s+1} - \epsilon'$ to $y_{s+1} + \epsilon'$. Let $y_{s+1} - y_s = \epsilon + \epsilon' = h$ be so
small that its square may be neglected. Let $\bar{x}$, $\bar{y}$ be the mean values of the
characters, $N$ the total frequency. We will find the changes in the moments and
constants supposing the arrays $n_s$ and $n_{s+1}$ interchanged in position.

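Clearly $\delta\bar{x} = 0$ and $\delta\sigma_x = 0$.

$N(\bar{y} + \delta\bar{y}) = S(y_s n_s) + h(n_s - n_{s+1})$,

or,

$\delta\bar{y} = h(n_s - n_{s+1})/N$.

Again,

$N(\sigma_y + \delta\sigma_y)^2 = S(y_s^2 n_s) + 2h(y_s n_s - y_{s+1} n_{s+1}) - N(\bar{y} + \delta\bar{y})^2$,*

$2N\sigma_y\,\delta\sigma_y = 2h(y_s n_s - y_{s+1} n_{s+1}) - 2N\bar{y}\,\delta\bar{y}$,

$\delta\sigma_y/\sigma_y = (h/\sigma_y^2)\,\{(y_s - \bar{y})n_s - (y_{s+1} - \bar{y})n_{s+1}\}/N$.

Next, if

$P = S(xy) - N\bar{x}\bar{y}$,

$P + \delta P = S(xy) + h(n_s\bar{x}_s - n_{s+1}\bar{x}_{s+1}) - N\bar{x}\bar{y} - N\bar{x}\,\delta\bar{y}$,

or,

$\delta P = h\{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})\}$,

where $\bar{x}_s$ and $\bar{x}_{s+1}$ are the means of the arrays $n_s$ and $n_{s+1}$.
But if $r$ be the correlation coefficient of the $x$ and $y$ characters,

$r = P/(N\sigma_x\sigma_y)$.

Therefore

$\delta r/r = \delta P/P - \delta\sigma_x/\sigma_x - \delta\sigma_y/\sigma_y$,

* It must be noted here that the squares of the change in $\bar{y}$ and $\sigma_y$ are neglected. Hence the changes
must not be so great that $\delta\bar{y}$ and $\delta\sigma_y$ are sensible as compared with $\bar{y}$ and $\sigma_y$.

and substituting the above values,

$\delta r/r = h\,[\{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})\}/P - \{(y_s - \bar{y})n_s - (y_{s+1} - \bar{y})n_{s+1}\}/(N\sigma_y^2)]$.

If this is to vanish for any value of $s$ and $h$, it will be sufficient, since
$P = Nr\sigma_x\sigma_y$, that

$(\bar{x}_s - \bar{x})/(r\sigma_x) = (y_s - \bar{y})/\sigma_y$.

Or, if the mean $\bar{x}_m$ of any $y_m$-array of individuals be determined by

$\bar{x}_m - \bar{x} = (r\sigma_x/\sigma_y)(y_m - \bar{y})$.

But this is the condition for linear regression.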

Hence we conclude that in any correlated system of variables, obeying the law of
linear regression, we can, without sensibly modifying the correlation, interchange two
adjacent $y$-arrays (e.g., two rows of the correlation table), provided the grouping be
fine. But if we can interchange any two adjacent $y$-arrays, we can, by a repetition
of such changes, interchange any two $y$-arrays whatever; and a precisely similar
statement must be valid for any two $x$-arrays (e.g., two columns of the correlation
table). Hence, given a sufficiently small system of grouping, we may state that in all
cases of linear regression the actual order of the scales is immaterial as far as the
determination of the correlation is concerned.

The practical importance of this result would appear to be great, for it frees us
when dealing with scale orders from the need for supposing normal frequency; the
indifference of the scale order when determining correlation is still true, provided the
regression is linear; and this linearity of regression is not only found from observation
to be very general (for example, in inheritance problems*) but follows from theory
itself in the case of various hypotheses.†

In actual practice, of course, the degree of fineness of the grouping is limited by
many considerations, and hence it will often be better to proceed by the fourfold
division method, taking that division where possible at a very distinct classification.
But the general principle now demonstrated will enable us in future to pay much less
attention to the actual order chosen for the scales if we are dealing with a class of
characters for which we may reasonably presume the regression to be sensibly linear.

* See "The Laws of Inheritance in Man. I. Inheritance of the Physical Characters," 'Biometrika,'
vol. 2, pp. 362-3; also "Inheritance of Mental and Moral Characters in Man," 'Huxley Memorial
Lecture,' 1903. 'Journal of the Anthropological Institute,' vol. 33, pp. 185-7.

† "Contributions to the Theory of Evolution. XII. On a Generalised Theory of Alternative
Inheritance, with special reference to Mendel's Laws." 'Phil. Trans.,' A, vol. 203, p. 85.

(9.) If we take the crudest possible division of our material into only four groups,
thus:

        a          c         a + c
        d          b         d + b
      a + d      c + b         N

corresponding to what Mr. Yule has termed the association of two attributes, we
have at once

$\psi = 2(ab - cd)/N^2$ . . . . (xxvii.),

$\phi^2 = (ab - cd)^2/\{(a + d)(c + b)(a + c)(d + b)\}$ . . . . (xxviii.).

Now it is clear that in this case $\phi^2$ reduces to $r_{hk}^2$, where $r_{hk}$ is the correlation
between errors in the position of the means of the two characters under consideration,
as determined by a fourfold table, and $\tfrac{1}{2}\psi$ is in this simple case what I have defined
as the transfer per unit of total frequency.* Both are expressions intimately
connected with the conception of association, and have already been discussed in
relation to it.† The coefficients, $C_1$ and $C_2$, of contingency (either of which might
serve as a measure of the association) will not in this simple case, however, be
necessarily even approximately equal to each other, still less to either the coefficient
of correlation or Mr. Yule's coefficient of association.‡

It is worth while illustrating this on a numerical example. Taking the small-pox
returns for the epidemic of 1890, we have:

   Cicatrix.      Recoveries.    Deaths.    Totals.

   Present           1562           42        1604
   Absent             383           94         477

   Totals            1945          136        2081

* 'Phil. Trans.,' A, vol. 195, pp. 12 and 14.

† Ibid., p. 15 et seq.

‡ 'Phil. Trans.,' A, vol. 194, p. 272.
These give us $\phi^2 = .0845$, $\chi^2 = 175.76$, $\psi = .0604$. From these we find

$C_1 = .279$, $C_2 = .190$.

Yule's coefficient of association = .803.

Coefficient of correlation by fourfold division = .595.

Grade of contingency $= 1 - P$,* where $P = 718/10^{40}$.

Now so far as numerical values go these things are all totally different. $C_1$, $C_2$,
and the coefficient of association depend very largely on where the fourfold division is
taken.† It is extremely difficult to use them therefore for comparative purposes. On
the other hand, the coefficient of correlation (with the assumption, however, of
normality) is free of this restriction; it brings us into line with other things for
comparative purposes. The grade of contingency is also independent in a sense of
the division, i.e., it has a definite physical meaning. What it tells us is this, that the
deviation from independent probability in the relation between the result of a case of
small-pox and the presence or absence of cicatrix is such that the above table could only
arise 718 times in $10^{40}$ cases if the two events were absolutely independent.
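
The small-pox figures can be retraced from the fourfold expressions (xxvii.) and (xxviii.). The sketch below is an illustration, not the original working: the variable names are arbitrary, Yule's coefficient is taken as $(ab - cd)/(ab + cd)$ in the present lettering, and $P$ is computed from the four-group expression quoted in the footnote, written with the complementary error function from Python's `math` module. $C_2$ is omitted because it depends on the table or diagram of the memoir.

```python
from math import sqrt, exp, erfc, pi

# Small-pox table in the a, c / d, b lettering used for (xxvii.) and (xxviii.).
a, c = 1562, 42  # cicatrix present: recoveries, deaths
d, b = 383, 94   # cicatrix absent: recoveries, deaths
n = a + b + c + d

psi = 2.0 * (a * b - c * d) / n**2                                     # (xxvii.)
phi2 = (a * b - c * d) ** 2 / ((a + d) * (c + b) * (a + c) * (d + b))  # (xxviii.)
chi2 = n * phi2
c1 = sqrt(phi2 / (1.0 + phi2))
yule_q = (a * b - c * d) / (a * b + c * d)

# P for four groups: sqrt(2/pi) * integral_chi^inf exp(-t^2/2) dt
#                    + sqrt(2/pi) * chi * exp(-chi^2/2),
# where the first term equals erfc(chi / sqrt(2)).
chi = sqrt(chi2)
p = erfc(chi / sqrt(2.0)) + sqrt(2.0 / pi) * chi * exp(-chi2 / 2.0)

print(psi, phi2, chi2, c1, yule_q, p)
```

The value of `p` is of the order of $7 \times 10^{-38}$, which is the "718 times in $10^{40}$" of the text to within the rounding of $\chi^2$.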

If, instead of a table like the above, we take a number of alternative possibilities
for each attribute, the coefficient of association loses its uniqueness of meaning;
$C_1$ and $C_2$ still retain their significance, and as the number of alternatives becomes
greater, merge in the coefficient of correlation. The grade of contingency, on the other
hand, retains the same perfectly definite meaning throughout. I think this statement
may serve as some warning of the caution needful in using the coefficients now
introduced. The degree of approach of both $C_1$ and $C_2$ to the correlation must be
studied for each special class of cases, and only when this has been done will their
use be really legitimate and effective.

* When the number of groups = 4, we have ('Phil. Mag.,' vol. 50, p. 157 et seq.):

$P = \sqrt{2/\pi}\int_\chi^\infty e^{-\frac{1}{2}\chi^2}\,d\chi + \sqrt{2/\pi}\,\chi\,e^{-\frac{1}{2}\chi^2}$,

whence $P$ is easily found if $\chi^2$ be large.

† Yule, loc. cit., p. 276.

(10.) On the Relation between Multiple Contingency and Multiple Normal Correlation.

Suppose instead of a single correlation table we have a multiple correlation system.
Such a system is well illustrated by the cabinet at Scotland Yard, which contains the
measurements of habitual criminals on the old system of body measurements, now
discarded in favour of a finger-print index. We have in this case a division of the
cabinet into 3 compartments, which mark a threefold division of long, medium, and
short head lengths. Each of these vertical divisions is then sub-divided horizontally
into three divisions giving the corresponding divisions for head breadth; each of these
head-breadth divisions has three drawers for large, moderate, and small face breadths.
Each drawer is sub-divided into three sections for three finger groups, and these again
into compartments for cubit groups, and so on. If this be carried out for the seven
characters dealt with, we should have ultimately $3^7$ sub-groups forming a multiple
correlation system of the 7th order.* We may ask what is the mean square
contingency of such a system and to what extent does it diverge from an independent
probability system? Of course, for an ideal anthropometric index system the
divergence should be very slight.

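Let $x_1, x_2, \ldots x_n$ be the $n$ variables of a multiple normal correlation surface, to which
the equation is

$z = \dfrac{N}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n\sqrt{R}}\,\exp\left[-\tfrac{1}{2}\left\{S_1\left(\dfrac{R_{pp}}{R}\,\dfrac{x_p^2}{\sigma_p^2}\right) + 2S_2\left(\dfrac{R_{pq}}{R}\,\dfrac{x_p x_q}{\sigma_p\sigma_q}\right)\right\}\right]$.

Here $\sigma_1, \sigma_2 \ldots \sigma_n$ are the standard deviations of the $n$ variables; $S_1$ denotes a sum
of all values of $p$ from 1 to $n$, $S_2$ a sum of all unlike values of $p$ and $q$ from 1 to $n$;
while $R$ is the determinant

$R = \begin{vmatrix} 1 & r_{12} & r_{13} & \ldots & r_{1n} \\ r_{21} & 1 & r_{23} & \ldots & r_{2n} \\ r_{31} & r_{32} & 1 & \ldots & r_{3n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ r_{n1} & r_{n2} & r_{n3} & \ldots & 1 \end{vmatrix}$,

and $R_{pq}$ is the minor corresponding to the constituent $r_{pq}$, and the $r$'s are the
correlation coefficients.†

Now if $\phi^2$ be the mean square contingency, we have

$\phi^2 = \dfrac{1}{N}\int\!\!\int\!\ldots\!\int \dfrac{(z - z_0)^2}{z_0}\,dx_1\,dx_2\ldots dx_n$,

where $z_0$ = value of $z$ when all the $r$'s are zero, or

$z_0 = \dfrac{N}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n}\,\exp\left[-\tfrac{1}{2}S_1\left(x_p^2/\sigma_p^2\right)\right]$.

Thus we have, writing $x_p = \sigma_p x'_p$, etc.,

$\phi^2 = \dfrac{1}{(2\pi)^{n/2}}\int\!\!\int\!\ldots\!\int\left(\dfrac{\zeta^2}{\zeta_0} - 2\zeta + \zeta_0\right)dx'_1\,dx'_2\ldots dx'_n$,

* See Macdonell, "On Criminal Anthropometry," 'Biometrika,' vol. 1, p. 205 et seq.
† 'Phil. Trans.,' A, vol. 187, p. 302, or ibid., A, vol. 200, pp. 3-8.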
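where

$\zeta = \dfrac{1}{\sqrt{R}}\,\exp\left[-\tfrac{1}{2}\left\{S_1\left(\dfrac{R_{pp}}{R}\,x_p'^2\right) + 2S_2\left(\dfrac{R_{pq}}{R}\,x'_p x'_q\right)\right\}\right]$,

$\zeta_0 = \exp\left[-\tfrac{1}{2}S_1\left(x_p'^2\right)\right]$.

Now

$\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\ldots\!\int_{-\infty}^{+\infty}\exp\left[-\tfrac{1}{2}\left\{S_1(c_{pp}x_p^2) + 2S_2(c_{pq}x_p x_q)\right\}\right]dx_1\,dx_2\ldots dx_n = (2\pi)^{n/2}/\sqrt{\Delta}$ . . . . (xxix.),

where

$\Delta = \begin{vmatrix} c_{11} & c_{12} & c_{13} & \ldots & c_{1n} \\ c_{21} & c_{22} & c_{23} & \ldots & c_{2n} \\ c_{31} & c_{32} & c_{33} & \ldots & c_{3n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ c_{n1} & c_{n2} & c_{n3} & \ldots & c_{nn} \end{vmatrix}$.
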

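We are now in a position to find all the integrals involved in the equation for $\phi^2$;
we have

$\phi^2 = \dfrac{1}{R\sqrt{\Delta'}} - 2 + 1 = \dfrac{1}{R\sqrt{\Delta'}} - 1$,

where

$\Delta' = \begin{vmatrix} \dfrac{2R_{11}}{R} - 1 & \dfrac{2R_{12}}{R} & \dfrac{2R_{13}}{R} & \ldots & \dfrac{2R_{1n}}{R} \\ \dfrac{2R_{21}}{R} & \dfrac{2R_{22}}{R} - 1 & \dfrac{2R_{23}}{R} & \ldots & \dfrac{2R_{2n}}{R} \\ \dfrac{2R_{31}}{R} & \dfrac{2R_{32}}{R} & \dfrac{2R_{33}}{R} - 1 & \ldots & \dfrac{2R_{3n}}{R} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ \dfrac{2R_{n1}}{R} & \dfrac{2R_{n2}}{R} & \dfrac{2R_{n3}}{R} & \ldots & \dfrac{2R_{nn}}{R} - 1 \end{vmatrix}$.
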

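To evaluate this determinant, we notice that since $r_{pp} = 1$, we have, if $p$ and $q$ be
different,

$R_{p1}r_{q1} + R_{p2}r_{q2} + \ldots + R_{pn}r_{qn} = 0$.

Hence

$\dfrac{2R_{p1}}{R}r_{q1} + \ldots + \left(\dfrac{2R_{pp}}{R} - 1\right)r_{qp} + \ldots + \dfrac{2R_{pn}}{R}r_{qn} = -r_{qp}$,

and

$\dfrac{2R_{p1}}{R}r_{p1} + \ldots + \left(\dfrac{2R_{pp}}{R} - 1\right)r_{pp} + \ldots + \dfrac{2R_{pn}}{R}r_{pn} = 2 - 1 = 1$.

Now multiply the determinant $\Delta'$ by the determinant $R$; we find, using the above
relations,

$\Delta' \times R = \begin{vmatrix} 1 & -r_{12} & -r_{13} & \ldots & -r_{1n} \\ -r_{21} & 1 & -r_{23} & \ldots & -r_{2n} \\ -r_{31} & -r_{32} & 1 & \ldots & -r_{3n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ -r_{n1} & -r_{n2} & -r_{n3} & \ldots & 1 \end{vmatrix} = R'$, say.

Here $R'$ is $R$ with the sign of all the correlations changed. Hence it follows that

$\phi^2 = \dfrac{1}{\sqrt{RR'}} - 1$ . . . . (xxx.).
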

Special Cases.

(i.) Simple correlation

$R' = R = 1 - r_{12}^2$, and $\phi^2 = r_{12}^2/(1 - r_{12}^2)$, as before.

(ii.) Triple correlation

$R = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 + 2r_{23}r_{31}r_{12}$,
$R' = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12}$.


(iii.)  Quadruple  correlation 


—  I. 


and  so  on. 

Clearly a condition has to be satisfied among the correlation coefficients, or the
process by which we have deduced $\phi^2$ is not legitimate. We must have $\Delta$ positive for
equation (xxix.) to be true. Now, for normal correlation $R$ must be real and positive,
or the equation to the multiple correlation surface becomes imaginary. Hence it
follows that $\Delta'$ must be positive, and therefore $R'$ must be positive. This seems to
give a definite condition to be satisfied by the correlation coefficients, and in some
cases rather narrow limits are enforced. For example, in the case of triple correlation
we must have

$1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12}$

positive, and this appears to reduce very considerably the possible values for the
correlationship of three characters.* The source of this novel condition appears to be
in the integration of the term $\zeta^2/\zeta_0$, and this is only possible by use of equation (xxix.),
provided the surface $z = \zeta^2/\zeta_0$ has "ellipsoidal" contours. If it has not, we may get
the subject of integration becoming infinite with one or other of the $x$'s, and
consequently, although both $\zeta$ and $\zeta_0$ vanish at $\infty$, $\zeta^2/\zeta_0$ may not do so, i.e., the mean
square contingency tends in certain tracks to become indefinitely large. In fact, our
method of deducing multiple contingency from the normal correlation coefficients is only
valid provided the system is not only a possible correlation system with the given values
of the coefficients, but also when these coefficients all have their signs reversed.
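
The result $\phi^2 = 1/\sqrt{RR'} - 1$ and the accompanying condition on $R'$ can be tried numerically. The following is a minimal sketch with hypothetical function names; the determinant is expanded by cofactors, which is adequate only for the small orders considered here, and the family of correlations (parental correlation .5, fraternal correlation varied) is chosen to match the example given in the footnote on triple correlation.

```python
from math import sqrt

def det(m):
    """Determinant by cofactor expansion along the first row (small orders only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def sign_reversed(r):
    """The correlation matrix with the sign of every off-diagonal term changed."""
    n = len(r)
    return [[r[i][j] if i == j else -r[i][j] for j in range(n)] for i in range(n)]

def normal_mean_square_contingency(r):
    """phi^2 = 1 / sqrt(R R') - 1, valid only when R and R' are both positive."""
    big_r = det(r)
    big_r_prime = det(sign_reversed(r))
    if big_r <= 0 or big_r_prime <= 0:
        raise ValueError("R and R' must both be positive")
    return 1.0 / sqrt(big_r * big_r_prime) - 1.0

def family(brothers):
    """Father and two sons: parental correlation .5, fraternal correlation varied."""
    return [[1.0, 0.5, 0.5], [0.5, 1.0, brothers], [0.5, brothers, 1.0]]

simple = normal_mean_square_contingency([[1.0, 0.5], [0.5, 1.0]])
allowed = det(sign_reversed(family(0.4)))
forbidden = det(sign_reversed(family(0.6)))
print(simple, allowed, forbidden)
```

With a fraternal correlation of .4 the sign-reversed determinant is still positive; at .6 it is negative although $R$ itself remains positive, so that the formula ceases to apply.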

(11.) Illustrations. A. Stature in Father and Son.

Table II. gives the distribution of 1078 cases of stature in father and son.† The
correlation $r$, as found from the product moment in the usual way, is .514.

I propose to consider the approach of $C_1$ and $C_2$ to $r$ as we increase the fineness of
the grouping. Clearly it would involve extreme labour to work out the contingencies
(especially the mean square contingency) for the table as it stands.

To begin with I classed in three inch groups and got the following table, in which
the figures in brackets are the independent probabilities.


Table III. Stature of Father and Son in Inches.

                                      Stature of Father.

Stature      58.5-61.5  61.5-64.5  64.5-67.5  67.5-70.5  70.5-73.5  73.5-76.5   Totals.   Chances.
of Son.

58.5-61.5        -         1.5        2           -          -          -          3.5      .0032
               (.05)      (.36)     (1.20)     (1.32)      (.50)      (.03)

61.5-64.5       3.5       19         33          5.5        1.5         -         62.5      .0580
               (.84)     (6.50)    (21.75)    (23.87)     (9.02)      (.55)

64.5-67.5       8.5       53.75     148         80.5        8.25        -        299        .2774
              (4.02)    (31.07)   (104.03)   (114.15)    (43.14)     (2.64)

67.5-70.5       2.5       33.25     149.25     202.25      60.25       3.5       451        .4184
              (6.07)    (46.86)   (156.90)   (172.17)    (65.06)     (3.97)

70.5-73.5        -         3.5       39.75     104.25      62          3.5       213        .1976
              (2.87)    (22.13)    (74.10)    (81.31)    (30.73)     (1.88)

73.5-76.5        -         1          3         14.5       20.5        2.5        41.5      .0385
               (.56)     (4.31)    (14.44)    (15.84)     (5.99)      (.37)

76.5-79.5        -          -          -         4.5        3           -          7.5      .0069
               (.10)      (.77)     (2.59)     (2.84)     (1.07)      (.07)

Totals         14.5      112        375        411.5      155.5        9.5      1078       1.0000

* For example, if .5 be the value of parental correlation, then the correlation of two brothers could not
exceed .5 without making $R'$ negative.
† See 'Biometrika,' vol. 2, p. 415.
The independent probabilities were found by multiplying the "chances" of a son
occurring in each group by the totals for each group of fathers. Taking the difference
of the observed sub-group frequencies and the independent probability frequencies, we
have $N \times \psi = 205.62$ from the positive and $= -205.66$ from the negative differences,
a quite good agreement. Hence we find $\psi = .1908$.

Using Diagram I. we have

$C_2 = .522$.

Proceeding now to the mean square contingency, obtained by squaring all the above
found contingencies, dividing each by the independent probability frequency and
summing, we find

$\phi^2 = .2755$,

whence

$C_1 = .465$.
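
The working just described can be repeated mechanically. The sketch below is an illustration only: the cell frequencies are those of Table III. as read from this copy (rows are sons from 58.5-61.5 to 76.5-79.5, columns fathers from 58.5-61.5 to 73.5-76.5), the variable names are arbitrary, and the independent-probability frequencies are recomputed without rounding, so the results should land close to the printed figures without necessarily agreeing in the last digit.

```python
from math import sqrt

# Observed frequencies of Table III. (rows: stature of son; columns: stature of father).
table_iii = [
    [0, 1.5, 2, 0, 0, 0],
    [3.5, 19, 33, 5.5, 1.5, 0],
    [8.5, 53.75, 148, 80.5, 8.25, 0],
    [2.5, 33.25, 149.25, 202.25, 60.25, 3.5],
    [0, 3.5, 39.75, 104.25, 62, 3.5],
    [0, 1, 3, 14.5, 20.5, 2.5],
    [0, 0, 0, 4.5, 3, 0],
]

n_total = sum(map(sum, table_iii))
son_totals = [sum(row) for row in table_iii]
father_totals = [sum(col) for col in zip(*table_iii)]

positive = negative = chi2 = 0.0
for i, row in enumerate(table_iii):
    for j, observed in enumerate(row):
        # "Chance" of a son falling in group i, times the total of father group j.
        expected = son_totals[i] / n_total * father_totals[j]
        diff = observed - expected
        chi2 += diff * diff / expected
        if diff > 0:
            positive += diff
        else:
            negative += diff

psi = positive / n_total         # mean contingency
phi2 = chi2 / n_total            # mean square contingency
c1 = sqrt(phi2 / (1.0 + phi2))   # mean square contingency coefficient
print(positive, negative, psi, phi2, c1)
```

The positive and negative sums balance exactly here because the bracketed frequencies are not rounded; the slight disagreement between 205.62 and 205.66 in the text comes from rounding them to two decimals.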

The value of $C_1$ is clearly too small. We must infer that our grouping was not
fine enough. Accordingly in Table IV. I have re-arranged the matter in 2-inch
groupings, and have then in the same manner proceeded to find $\psi$ and $\phi^2$. In this
case I found $\psi = .2013$, and thus

$C_2 = .542$,

while

$\phi^2 = .3568$,

and

$C_1 = .513$.

I thus conclude that the grouping is now fine enough to give $C_1$ and $C_2$
approximately equal to the correlation.*


* i.e., within the probable error of that result.
Table IV. Stature of Father and Son in Inches.

                                                  Stature of Father.

Stature     58.5-60.5 60.5-62.5 62.5-64.5 64.5-66.5 66.5-68.5 68.5-70.5 70.5-72.5 72.5-74.5 74.5-76.5  Totals.  Chances.
of Son.

59.5-61.5       -         -        1.5       1         1         -         -         -         -         3.5    .00325
              (.02)     (.08)     (.31)     (.77)     (.95)     (.84)     (.41)     (.11)     (.02)

61.5-63.5      .5        2.75      5.75      9.5       5         .25       .25       -         -        24      .02226
              (.11)     (.56)    (2.11)    (5.29)    (6.49)    (5.73)    (2.83)     (.72)     (.12)

63.5-65.5      4         7.75     20        41.5      17.25      8.25      1.25      -         -       100      .09276
              (.60)    (2.32)    (8.81)   (22.03)   (27.04)   (23.89)   (11.78)    (3.01)     (.51)

65.5-67.5      2        10        32        73        78.75     33.5       7.25      1         -       237.5    .22032
             (1.43)    (5.51)   (20.93)   (52.33)   (64.22)   (56.73)   (27.98)    (7.16)    (1.21)

67.5-69.5       -        4.5      27.75     65.5      95        93.25     31.5       4.5       1       323      .29963
             (1.95)    (7.49)   (28.46)   (71.16)   (87.34)   (77.15)   (38.05)    (9.74)    (1.65)

69.5-71.5       -         -        6.75     38.25     61        77.5      39.5      11         2       236      .21892
             (1.42)    (5.47)   (20.80)   (51.99)   (63.82)   (56.37)   (27.80)    (7.11)    (1.20)

71.5-73.5       -         -         .25      5.75     24.75     34.5      32.25      7         .5      105      .09740
              (.63)    (2.44)    (9.25)   (23.13)   (28.39)   (25.08)   (12.37)    (3.17)     (.54)

73.5-75.5       -         -        1         3         6.25      6.75     13         5.5       2        37.5    .03479
              (.23)     (.87)    (3.31)    (8.26)   (10.14)    (8.96)    (4.42)    (1.13)     (.19)

75.5-77.5       -         -         -         -        2.5       1.5       1.5       2.5       -         8      .00742
              (.05)     (.19)     (.70)    (1.76)    (2.16)    (1.91)     (.94)     (.24)     (.04)

77.5-79.5       -         -         -         -         -        2          .5       1         -         3.5    .00325
              (.02)     (.08)     (.31)     (.77)     (.95)     (.84)     (.41)     (.11)     (.02)

Totals         6.5      25        95       237.5     291.5     257.5     127        32.5       5.5    1078     1.00000

To show the effect of too fine a grouping, I worked out the mean contingency for
the inch grouping in Table II. There resulted

$\psi = .2309$, giving $C_2 = .597$.

I therefore conclude that with sufficiently fine grouping the new method of
contingency will give contingency coefficients sensibly equal to the correlation
coefficient; but that with over-fine grouping the effect of individual units, scattered
here and there at random over the table, becomes influential and exaggerates the
value of the correlation. Hence, when a correlation table can be formed and worked
in the old ways, there is little doubt that it is safer to do so, and the labour will
hardly be sensibly greater, at least when compared with the method of mean square
contingency. I have not faced the labour required to determine the mean square
contingency of the table with 340 sub-groups. Dr. Lee has worked out the mean
square contingency for a table with 400 sub-groups, and we do not think it desirable
to deal with a table of more than $10^2$ to $15^2$ entries again. Still the mean square
contingency coefficient will hardly be as great on the full table as the mean
contingency coefficient.

The  following  table  gives  the  results  : — 




Comparison of Methods of Finding Correlation.

No. of         Mean             Mean square       Fourfold                    Correlation
groupings.     contingency.     contingency.      division.                   table.

    42            .522              .465          (Mean of six divisions)*
    90            .542              .513             .550
   340            .597               -                                           .514

Thus the first contingency method approaches the fourfold, the second, the
ordinary correlation method.

Diagram II. at the end of this memoir gives the hyperbola of zero contingency for
this case, calculated on the basis of the correlation coefficient being .514. The means
and standard deviations are:

Father . . . . 67.698 in., 2.7048 in.,
Son . . . . . 68.661 in., 2.7321 in.,

and the equation to the hyperbola referred to the means as the origin is

$x^2 - 3.8522\,xy + .9801\,y^2 = 6.2510$.

The shaded squares are those of positive contingency. It will be seen that the
hyperbola separates fairly well areas of positive from areas of negative contingency.
In most cases where there is an invasion across the boundary, the contingencies
hardly differ from zero by amounts greater than the probable errors due to random
sampling.
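
The coefficients of this hyperbola can be checked from the constants just quoted. The sketch below rests on an assumption not spelled out at this point of the text: that the curve is obtained by equating the normal correlation surface to the surface of independent probability and taking logarithms, with $x$ the deviation of the father and $y$ that of the son. The function name is hypothetical.

```python
from math import log

def zero_contingency_hyperbola(r, sigma_x, sigma_y):
    """Coefficients (b, c, k) of x^2 - b*x*y + c*y^2 = k, origin at the means.

    Assumes a normal surface with correlation r set equal to the product of
    its two normal marginals (zero contingency).
    """
    b = 2.0 * sigma_x / (r * sigma_y)
    c = sigma_x**2 / sigma_y**2
    k = -(sigma_x**2) * (1.0 - r * r) / (r * r) * log(1.0 - r * r)
    return b, c, k

b, c, k = zero_contingency_hyperbola(r=0.514, sigma_x=2.7048, sigma_y=2.7321)
print(b, c, k)
```

On this assumption the three coefficients agree with 3.8522, .9801 and 6.2510 to the number of figures printed.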

* See 'Phil. Trans.,' A, vol. 195, p. 42. The values range from .521 to .594, or almost the same range
as we obtain from the mean contingency results.

Illustration B. Data from Colour Inheritance in Greyhounds.

In the previous example we have dealt with material in which contingency methods
were directly comparable as to result with the correlation found by the "best" or
product method process. In this illustration I deal with matter which can only
provide a correlation to be found by the fourfold division process for comparison with
the contingency coefficients. The data from which this illustration is drawn were
extracted by Miss A. Barrington from the 'Greyhound Studbook.' We deal with
the inheritance of red and black pigments in the coat colour. I have selected six
cases of the resemblance of brethren from different litters to compare the methods on.
Tables were formed giving 16 to 25 contingency sub-groups of varying degrees of
pigment, and these were worked out (a) by Miss Barrington herself for the mean
square contingency, (b) by myself for the mean contingency, and (c) by Dr. A. Lee
for the fourfold correlation results. The results reached are given in the accompanying
table. It is desirable to state that the number dealt with was about 1000 pairs of
brethren in each case.


Table V. Fraternal Resemblance of Greyhounds from Different Litters.

Character.                        C1, Mean Square      C2, Mean           r, Fourfold
                                  Contingency.         Contingency.       Table.

Red in brothers                       .478                .695               .456
 "     sisters                        .528                .612               .620
 "     sister and brother             .488                .615               .450
Black in brothers                     .512                .615               .558
 "     sisters                        .482                .632               .552
 "     sister and brother             .502                .622               .593

Mean                                  .498                .632               .538
Mean deviation from mean              .016                .032               .057

We see at once from this table that the method of mean square contingency gives
far more uniform results than either the mean contingency method or the fourfold
division method. The average given by it is close to what we have found for
fraternal resemblance, i.e., .5, in other cases, and within fairly close limits, all six
cases now give .5. The mean contingency gives results more divergent among
themselves, but less so than those of the fourfold division method; their average,
however, diverges most from what we have found in other cases.

.  The  lesson,  I  think,  to  be  learnt  from  this  is  :  That  the  mean  square  contingency 
coefficient,  although  more  laborious  to  find,  is  better  than  the  mean  contingency 
coefficient.  That  even  with  only  16  to  25  contingency  sub-groups  we  may  deduce 
results  comparable  with  those  obtained  by  fourfold  divisions.  But  that  it  is  probably 
alivays  necessary  to  check  a  series  by  a  certain  number  of  fourfold  division  workings, 
for  such  are  the  only  test  that  we  have  not  got  too  crude  a  grouping  reducing  the 
contingency  coefficient  below  the  correlation  value,  or  too  fine  a  grouping  introducing 
the  difficulty  already  referred  to  (see  p.  IG),  of  magnifying  the  contingency  coefficient 
owing  to  anomalous  units. 

Illustration C. — Hair Colour in Man.

I take the subject of hair colour because it is one in which doubts have been raised
as to the order of pigments in a scale.

The following table gives the resemblance of pairs of brothers in hair colour:

Table VI.

                                        First Brother.
Second Brother.    Red.      Fair.     Brown.    Dark.     Jet Black.    Totals.

Red                30.5      23        16        12        0             81.5
Fair               23        416       158       67.75     .25           665
Brown              16        158       394       98.25     8.25          674.5
Dark               12        67.75     98.25     328.5     19            525.5
Jet Black          0         .25       8.25      19        10            37.5

Totals             81.5      665       674.5     525.5     37.5          1984

The correlation found by taking the mean of four four-fold table divisions was .621.*
This result is based on the above scale order. We will now see what difference will
arise if we work by contingency, so that the scale order is absolutely indifferent, e.g.,
red might follow jet black.

We find

$$\phi^2 = .603896,$$

and accordingly $C_1 = .614$, a result within the limits of the probable error identical
with the value of r found from the four-fold division method.
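The figures of Table VI can be checked with a short computation. The sketch below is only illustrative: it assumes the two cells left blank in the table (red paired with jet black) are zero, takes the mean square contingency as the sum of squared contingencies divided by the independent-probability frequencies and then by the total, and takes the coefficient as the square root of $\phi^2/(1+\phi^2)$, as used later in Illustration D. The function names are hypothetical.

```python
from math import sqrt

# Table VI: rows = second brother, columns = first brother.
# Class order: red, fair, brown, dark, jet black.
# Assumption: the blank red / jet-black cells are taken as zero.
TABLE_VI = [
    [30.5, 23.0, 16.0, 12.0, 0.0],
    [23.0, 416.0, 158.0, 67.75, 0.25],
    [16.0, 158.0, 394.0, 98.25, 8.25],
    [12.0, 67.75, 98.25, 328.5, 19.0],
    [0.0, 0.25, 8.25, 19.0, 10.0],
]


def mean_square_contingency(table):
    """Sum of (observed - expected)^2 / expected, divided by the total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected on the hypothesis of independent probability.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / total


def contingency_coefficient(phi2):
    """Coefficient of mean square contingency, sqrt(phi^2 / (1 + phi^2))."""
    return sqrt(phi2 / (1.0 + phi2))


phi2 = mean_square_contingency(TABLE_VI)
c1 = contingency_coefficient(phi2)
print(f"phi^2 = {phi2:.4f}, C1 = {c1:.3f}")
```

No ordering of the five colour classes enters the calculation, which is the point of the comparison with the four-fold value.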

This illustration confirms the opinion I have already expressed, i.e., that if the
contingency be calculated for 16 to 36 sub-groups we shall obtain by the method of
mean square contingency satisfactory results, i.e., values close to the coefficient of
correlation as found by product moment or four-fold division methods. In this case,
as in others, I find the mean contingency far inferior to the mean square contingency.

My experience seems to show that about 25 sub-groups is the distribution to be
aimed at; 9 is too few. Thus I worked out the relationship of temper in sisters for
three-fold division — sullen, good-tempered, quick-tempered — or for 9 sub-groups.
The method of mean contingency gave .44 and of mean squared contingency .36.
Both are far too small, as I find from each of four four-fold divisions a result of about .5.

* "Huxley Memorial Lecture," 'Journal of Anthropological Institute,' vol. 33, pp. 197 and 215.

Illustration D. — On Occupational or Professional Correlation between Relatives.

I take as a final illustration a case in which any idea of scale is practically
inconceivable, and yet one in which it is of considerable interest to measure the
deviation from independent probability. It belongs to a class of problems in which I
hope this new method of contingency will be fruitful of result. In classifying men
into occupational and professional groups, we clearly cannot do so on the basis of any
scale which will put the army, church, and bar in any special order. On the other
hand, it becomes of special interest to determine how far tastes and preferences for
particular callings in life run in families. Miss Emily Perrin has undertaken a
lengthy investigation of this kind, and has provided me with the pure contingency
table given as Table VII. The occupations of 775 fathers and sons are here classed
in broad general groups, which can be arranged purely alphabetically. More minute
divisions and data for other series of relatives will be published later by Miss Perrin,
and it is not my present purpose to anticipate her conclusions, but merely to suggest
the valuable applications which may be made of the novel methods to pure
contingency results. What is the numerical measure of the relationship in pursuit
between father and son, and how far is it removed from a mere chance relationship?


Table  VII. — Contingency  between  Occupations  of  Fathers  and  Sons. 


Occupation of Son (fourteen columns and a Totals column; of the vertically printed
column headings only "Teacher, Clerk, Civil Servant" survives legibly).

Occupation of Father (rows):

Army.
Art.
Teacher, Clerk, Civil Servant.
Crafts.
Divinity.
Agriculture.
Landownership.
Law.
Literature.
Commerce.
Medicine.
Navy.
Politics and Court.
Scholarship and Science.

28 
2 

6 

5 

17 
3 

12 

1 

5 

5 

51 

5 

12 
5 
2 

1 
5 
1 
16 

4 
3 

3 

4 
1 

7 

2 
3 
4 
6 
1 
4 
2 

1 

2 

1 

6 
1 

1 

2 

2 
9 

5 
54 

3 
14 

6 

4 
15 

1 

3 
6 

1 

1 

3 

6 
2 

^  

1 
1 

1 

3 
1 

6 

1 
6 
1 

11 

18 

1 

5 

8 
3 

3 
2 

4 

7 
9 
4 
4 

13 
4 

13 
3 
1 

1 
1 

2 

1 

4 
1 
1 
1 

11 

1 

2 

3 

1 
2 
12 
4 
3

Totals . .   84   108   37   11   122   1   15   64   69   24   57   1   74   86   775

Miss Perrin has extracted this first series from the 'Dictionary of National
Biography'; hence she has, as a rule, tabled the distinguished, or at least moderately
distinguished, sons of less distinguished fathers. It is, for example, not easy to win
any form of distinction in agriculture. For this reason the distribution of occupations
for sons differs widely from that of the occupations for fathers. There has accordingly
been selection of the second generation, which undoubtedly must influence the
result, i.e., tend to weaken the observed relationship.

Working  out  the  196  contingencies,  squaring,  dividing  by  the  independent 
probability  frequencies,  summing  and  averaging,  I  find  for  the  mean  square 
contingency 

= 1.299206,

whence

$$\phi^2/(1 + \phi^2) = .393794,$$

and the coefficient of mean square contingency = .6275. This would correspond to
the correlation in occupation between father and son. Now if occupation were settled
solely by fitness or taste, and these characters were inherited as other human faculties,
we should expect the correlation between father and son to be about .46.* Or,
roughly, the hereditary relationship is increased by about 1/3 in the matter of
occupation. Remembering what we have noted as to selection above, the real
increment is probably somewhat larger than this. Roughly, however, we may
conclude from Miss Perrin's data that about 3/4 of the observed resemblance in
occupation between father and son is due to hereditary influences, and the remaining
1/4 to environmental effect. These numbers are subject to revision when Miss Perrin's
data are more ample and have been more fully analysed and discussed.
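The arithmetic behind these rough proportions can be set out explicitly, taking the figures .393794 and .46 exactly as they are given in the text. The symbol $C_1$ for the coefficient of mean square contingency is the notation of Table V; the verbal glosses are only an illustrative reading of the ratios.

```latex
\begin{align*}
C_1 &= \sqrt{.393794} \approx .6275, \\
\frac{.6275 - .46}{.46} &\approx .36
  \quad \text{(the excess over the hereditary value, relative to that value)}, \\
\frac{.46}{.6275} &\approx .73
  \quad \text{(the share of the observed resemblance matched by the hereditary value)}.
\end{align*}
```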

(12.)  General  Conclusions. 

The general conception of contingency developed in this memoir I consider in the
first place of theoretical importance. Its practical applications are not negligible, but
are, for reasons given below, of less importance than might a priori be supposed.

(a.)  In  the  first  place,  the  conception  of  contingency  enables  us  at  once  to  generalise 
the  notion  of  the  association  of  two  attributes  developed  by  Mr.  Yule.  We  can  class 
individuals  not  into  two  alternate  groups,  but  into  as  many  groups  with  exclusive 
attributes  as  we  please,  and  either  the  mean  contingency  or  the  mean  square 
contingency  will  enable  us  to  see  the  extent  to  which  two  such  systems  are  contingent 
or  non-contingent. 

(b.)  This  result  enables  us  to  start  from  the  mathematical  theory  of  independent 
probability as developed in the elementary text books, and build up from it a
generalised  theory  of  association,  or,  as  I  term  it,  contingency.  We  reach  the  notion 
of  a  pure  contingency  table,  in  which  the  order  of  the  sub-groups  is  of  no  importance 
whatever. 

(c.) We then investigate the relation of contingency to normal correlation, and
find that with normal frequency distributions both contingency coefficients pass with
sufficiently fine grouping into the well-known correlation coefficient. Since, however,
the contingency is independent of the order of grouping, we conclude that when we
are dealing with alternative and exclusive sub-attributes, we need not insist on the
importance of any particular order or scale for the arrangement of the sub-groups.

* 'Biometrika,' vol. 2, p. 379.

(d.)  This  conception  can  be  extended  from  normal  correlation  to  any  distribution 
with linear regression; small changes (i.e., such that the sum of their squares may
be  neglected  as  compared  with  the  square  of  mean  or  standard  deviation)  may  be 
made  in  the  order  of  grouping  without  affecting  the  correlation  coefficient. 

(e.)  The  results  (c)  and  (d)  are  not  so  fruitful  for  practical  working  as  might  at 
first  sight  appear,  for  they  depend  in  practice  on  the  legitimacy  of  replacing  finite 
integrals  by  sums  over  a  series  of  varying  areas,  where  no  quadrature  formula  is 
available.  If  we,  to  meet  the  difficulty,  make  a  very  great  number  of  small  classes, 
the  calculation,  especially  of  the  mean  square  contingency,  becomes  excessively 
laborious.  Further,  since  in  observation  individuals  go  by  units,  casual  individuals, 
which  may  fairly  represent  the  total  frequency  of  a  considerable  area,  will  be  found 
on  some  one  or  other  isolated  small  area,  and  thus  increase  out  of  all  proportion  the 
contingency.  The  like  difficulty  occurs  when  we  deal  with  outlying  individuals  in 
the  case  of  frequency  curves,  only  it  is  immensely  exaggerated  in  the  case  of 
frequency  surfaces. 

(f.)  It  is  thus  not  desirable  in  actual  practice  to  take  too  many  or  too  fine  sub- 
groupings.  It  is  found,  under  these  conditions,  that  the  correlation  coefficient  as 
determined  by  the  product  moment  or  fourfold  division  methods  is  approximated  to 
more  closely  in  the  case  of  the  contingency  coefficient  found  from  mean  square 
contingency  than  in  the  case  of  that  found  from  mean  contingency.  Probably 
16 to 25 contingency sub-groups will give fairly good results in the case of mean
square  contingency,  but  for  each  particular  type  of  investigation  it  appears  desirable 
to  check  the  number  of  groups  proper  for  the  purpose  by  comparison  with  the  results 
of  test  fourfold  division  correlations.  Under  such  conditions  it  appears  likely  that 
very  steady  and  consistent  results  will  be  obtained  from  mean  square  contingency. 

(g.)  Finally,  contingency  may  be  applied — of  course,  at  first  tentatively  and  with 
caution — in  the  consideration  of  a  whole  class  of  problems  in  which  no  attempt  at  a 
scale  or  order  of  sub-groups  is  possible,  in  short,  where  alphabetical  order  is  as  good 
as  any  other.  For  example,  it  would  seem  to  be  available  in  a  vast  range  of  problems 
of  exclusive  and  alternative  inheritance. 
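Two of the conclusions above, that the order of the sub-groups is of no importance and that too crude a grouping reduces the coefficient, can be illustrated numerically. The sketch below uses a purely hypothetical 4 x 4 table (the counts are invented for illustration and are not taken from the memoir) and hypothetical helper names; the coefficient is computed as the square root of $\phi^2/(1+\phi^2)$.

```python
from math import sqrt


def contingency_coefficient(table):
    """Coefficient of mean square contingency for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    phi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            phi2 += (observed - expected) ** 2 / expected
    phi2 /= total
    return sqrt(phi2 / (1.0 + phi2))


def permute(table, order):
    """Rearrange rows and columns into an arbitrary new class order."""
    return [[table[i][j] for j in order] for i in order]


def merge(table, groups):
    """Pool classes into coarser groups, e.g. [(0, 1), (2, 3)]."""
    return [
        [sum(table[i][j] for i in gi for j in gj) for gj in groups]
        for gi in groups
    ]


# Hypothetical counts, chosen only to show the behaviour.
HYPOTHETICAL = [
    [40, 10, 5, 2],
    [12, 35, 9, 4],
    [6, 11, 30, 8],
    [3, 5, 10, 25],
]

c_full = contingency_coefficient(HYPOTHETICAL)
c_shuffled = contingency_coefficient(permute(HYPOTHETICAL, [2, 0, 3, 1]))
c_coarse = contingency_coefficient(merge(HYPOTHETICAL, [(0, 1), (2, 3)]))
print(f"16 sub-groups: {c_full:.4f}")
print(f"same table, classes reordered: {c_shuffled:.4f}")
print(f"pooled to 4 sub-groups: {c_coarse:.4f}")
```

Reordering leaves the coefficient unchanged, while pooling classes can only lower it or leave it as it was; the opposite danger named in (e.), inflation from isolated units in very fine groupings, is not shown here.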


Plate 1.

Diagram II. Illustrating areas of positive and negative Contingency and the Hyperbola of Zero-Contingency.

N.B. Owing to an oversight on the part of the engraver, the absolute squareness of the elements in the
original drawing has been disregarded in this reproduction.


DRAPERS' COMPANY RESEARCH MEMOIRS.


DEPARTMENT   OF   APPLIED    MATHEMATICS,    UNIVERSITY  COLLEGE, 

UNIVERSITY  OF  LONDON. 

These memoirs will be issued at short intervals. The following are nearly ready and
will probably appear in this series:

Biometric  Seriesf  I. 

I. Mathematical Contributions to the Theory of Evolution. — XIII. On the Theory of Contingency and
its Relation to Association and Normal Correlation. By Karl Pearson, F.R.S.
II. Mathematical Contributions to the Theory of Evolution. — XIV. On Homotyposis in the Animal
Kingdom. By Ernest Warren, D.Sc., Alice Lee, D.Sc., Edna Lea-Smith, Marion Radford
and Karl Pearson, F.R.S.
III. Mathematical Contributions to the Theory of Evolution. — XV. On the Theory of Skew Correlation
and Non-linear Regression. By Karl Pearson, F.R.S.

Technical  Series  IT. 
I. On some Points in the Theory of Structures:
A. On Masonry Dams.
B. On the Relative Strength of Two-pivoted, Three-pivoted and Built-in Metal Arches. By
W. L. Atcherley, assisted by Karl Pearson, F.R.S.
II. On Crane and Coupling Hooks. By E. S. Andrews, B.Sc. Eng.
III. On Torsional Vibrations in Shafting. By Karl Pearson, F.R.S.




DEPARTMENT    OF    APPLIED  MATHEMATICS, 

UNIVERSITY  COLLEGE,  UNIVERSITY  OF  LONDON. 


DRAPERS'   COMPANY  RESEARCH 

MEMOIRS. 

BIOMETRIC   SERIES  II. 


MATHEMATICAL  CONTRIBUTIONS  TO  THE 
THEORY  OF  EVOLUTION. 

XIV.  ON  THE  GENERAL  THEORY  OF  SKEW  CORRELATION 

AND  NON-LINEAR  REGRESSION. 

BY 

KARL  PEARSON,  F.R.S. 


[WITH  FIVE  DIAGRAMS.] 


LONDON: 

PUBLISHED  BY  DULAU  AND  CO.,  37,  SOHO  SQUARE,  W. 

1905. 

Price  Five  Shillings. 



In March, 1903, the Worshipful Company of Drapers announced their intention
of granting £1,000 to the University of London to be devoted to the furtherance of
research and higher work at University College. After consultation between the
University and College authorities, the Drapers' Company presented £1,000 to the
University to assist the statistical work and higher teaching of the Department of
Applied Mathematics. It seemed desirable to commemorate this—probably first
occasion on which a great City Company has directly endowed higher research work
in mathematical science—by the issue of a special series of memoirs in the
preparation of which the Department has been largely assisted by the grant. Such
is the aim of the present series of 'Drapers' Company Research Memoirs.'

K. P.


Mathematical Contributions to the Theory of Evolution. — XIV. On the General
Theory of Skew Correlation and Non-linear Regression.

By Karl Pearson, F.R.S.

Contents.

(1.) Introductory. General conceptions as to skew variation and correlation. General
theory of skew variation within the limits of practical errors of sampling.
(2.) Generalised idea of correlation. The correlation ratio η and its relation to the
correlation coefficient r.
(3.) Probable errors of the correlation ratio and other constants of the arrays. Probable
error of r.
(4.) On the higher types of regression. Homoscedastic and heteroscedastic systems.
Homoclitic and heteroclitic systems.
(5.) Cubical regression. General equations for regression of any order.
(6.) Parabolic regression.
(7.) Linear regression.
(8.) Illustration A. — On the skew correlation between number of branches to the whorl
and position of the whorl on the spray in the case of Asperula odorata.
(9.) Illustration B. — On the skew correlation between age and head height in girls.
(10.) Illustration C. — On the skew correlation between size of cell and size of body in
Daphnia magna.
(11.) Illustration D. — On the skew correlation between number of branches to the whorl
and position of the whorl on the stem in Equisetum arvense.
(12.) Quartic regression. Necessary criteria for various types of regression.
(13.) Illustration E. — Calculation of quartic regression in the case of Equisetum arvense.
(14.) General conclusions. Nomenclature, clitic and scedastic curves. Difference between
mere curve fitting and regression calculations. Remarks on retention of decimals.


(1.) Introductory.

In a series of memoirs presented to the Royal Society I have endeavoured to show
that the Gaussian-Laplace normal distribution is very far from being a general law of
frequency distribution either for errors of observation* or for the distribution of
deviations from type such as occur in organic populations.† It is quite true that the
normal distribution applies within certain fields with a remarkable degree of accuracy,
notably in a whole series of anthropometric, particularly craniometric, observations.‡
In other fields it is not even approximately correct, for example in the distribution of
barometric variations,§ of grades of fertility and incidence of disease.|| For such
cases I have introduced a series of skew frequency curves which serve the purpose of
describing the frequency of innumerable skew distributions well within the errors of
random sampling. An exact test for "goodness of fit" in the case of frequency
distributions has also been now provided.¶

In dealing with frequency which diverges more or less conspicuously from the
normal law we require to bear in mind at least three important points:

(i.) Any expression for frequency must be a graduation formula. It is not a
disadvantage, but a fundamental requisite that it should smooth off "Scheingipfeln,"
so far as these are irregularities within the limits of random sampling.

Hence formulae like those provided by Thiele** and Wundt's pupils,†† which depend
upon taking enough "moments" to reproduce the complete frequency, are à priori
fallacious. Many interpolation formulae would do this completely, but such
interpolation formulae are not graduation formulae.

(ii.) The graduation formula must not depend upon the calculation of constants
having such a high probable error that their value is practically worthless.

Now, the probable error of high moments and products increases rapidly with their
dimensions; hence there is, beyond the labour of arithmetic, a practical limit to the
number of moments or products which can be effectively used in a graduation
formula.

(iii.) There must be a systematic method of approaching frequency distributions,
which can be applied to all cases with reasonably practical ease.

Now the immense majority, if not the totality, of frequency distributions in
homogeneous material show, when the frequency is indefinitely increased, a tendency to
give a smooth curve characterised by the following properties:

(i.) The frequency starts from zero, increases slowly or rapidly to a maximum, and
then falls again to zero — probably at a quite different rate — as the character for which
the frequency is measured is steadily increased. This is the almost universal
unimodal distribution of the frequency of homogeneous series. Homogeneity may
for practical purposes be taken to imply unimodality, although the converse is very
far from true.

* "On Errors of Judgment, &c.," 'Phil. Trans.,' A, vol. 198, pp. 235-299.
† "On Skew Variation, &c.," 'Phil. Trans.,' A, vol. 186, pp. 343-414.
‡ 'Biometrika,' vol. I., p. 443; vol. II., p. 344; vol. III., p. 230.
§ 'Phil. Trans.,' A, vol. 190, pp. 423-469.
|| 'Phil. Trans.,' A, vol. 192, pp. 257-330; 'The Chances of Death,' vol. I., pp. 69, et seq.; 'Biometrika,'
vol. I., p. 134 and p. 292; and for disease, 'Phil. Trans.,' A, vol. 186, pp. 390 and 407; A, vol. 197,
p. 159.
¶ 'Phil. Mag.,' vol. 50, 1900, pp. 157-174, and 'Biometrika,' vol. I., pp. 154-163.
** 'Forelaesninger over Almindelig Iagttagelseslaere,' Kjöbenhavn, 1889; 'Theory of Observations,'
London, 1903.
†† Wundt, 'Philosophische Studien.' A whole series of papers, by G. F. Lipps and others, seems to me
to quite miss the point of (i.) and (ii.) above.

(ii.) In the next place there is generally contact of the frequency curve at the
extremities of the range. These characteristics at once suggest the following form of
frequency curve, if $y\,\delta x$ measure the frequency falling between $x$ and $x + \delta x$:

$$\frac{1}{y}\frac{dy}{dx} = \frac{x + a}{F(x)} \qquad \text{(i.)}.$$

For in this case we have one mode only of the frequency, i.e., at $x = -a$, and
$dy/dx$ will vanish when $y = 0$.

But the assumption of this form, as long as $F(x)$ is general, is itself extremely
general, and it includes cases in which $dy/dx$ may not be zero, but take any values
from 0 to $\infty$, when $y = 0$.*

Now let us assume that $F(x)$ can be expanded by Maclaurin's theorem, and
equals $b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \ldots$ Then our differential equation to the frequency
will be

$$\frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \ldots} \qquad \text{(ii.)}.$$

There is now absolutely no difficulty in determining the unknown constants in
terms of the moments of the system. Multiply up and also by $x^n$, and then integrate
throughout the range of frequency, we have

$$\int x^n \left(b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \ldots\right)\frac{dy}{dx}\,dx = \int y\,(x + a)\,x^n\,dx \qquad \text{(iii.)}.$$

Or, noting that $y = 0$ at the ends of the range we have, with the usual notation for a
total frequency N, i.e.,

$$N\mu'_n = \int y\,x^n\,dx \qquad \text{(iv.)},$$

the result by integration by parts

$$n b_0 \mu'_{n-1} + (n+1)\,b_1 \mu'_n + (n+2)\,b_2 \mu'_{n+1} + (n+3)\,b_3 \mu'_{n+2} + \ldots = -\mu'_{n+1} - a\mu'_n \qquad \text{(v.)}.$$

Hence, if we write $n = 0, 1, 2, 3 \ldots s$ successively, we have $s + 1$ equations to find
$a, b_0, b_1, b_2 \ldots b_{s-1}$ in terms of the moments. For example, if we stop at $b_0$ we
require two moments, at $b_1$ three moments, at $b_2$ four moments, at $b_3$ six moments, at
$b_4$ eight moments, and at $b_{s-1}$, $s > 2$, $2s - 2$ moments.
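The step from equation (v.) to the constants can be sketched numerically. The code below is an illustrative sketch, not the memoir's own procedure: it writes (v.) for n = 0 ... s as a linear system in the unknowns a, b_0, ..., b_{s-1} and solves it in exact rational arithmetic. The function names are hypothetical, and the two sets of moments are assumed examples (moments about the mean of a normal curve with variance 4, and of a gamma curve with shape 4 and unit scale).

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Pick the first row at or below the diagonal with a non-zero entry.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def frequency_constants(moments, s):
    """Solve equation (v.) for n = 0..s; returns [a, b_0, ..., b_{s-1}].

    moments[k] is the k-th moment about the chosen origin, moments[0] == 1.
    """
    matrix, rhs = [], []
    for n in range(s + 1):
        row = [moments[n]]  # coefficient of a
        for k in range(s):
            # Coefficient of b_k is (n + k) * mu'_{n+k-1}; it is zero for n = k = 0.
            row.append((n + k) * moments[n + k - 1] if n + k > 0 else 0)
        matrix.append(row)
        rhs.append(-moments[n + 1])
    return solve_linear(matrix, rhs)


# Assumed example 1: normal curve, origin at the mean, mu_2 = 4, mu_4 = 3 * 4**2.
normal = frequency_constants([1, 0, 4, 0, 48], s=3)
# Assumed example 2: gamma curve with shape 4: mu_2 = 4, mu_3 = 8, mu_4 = 72.
gamma = frequency_constants([1, 0, 4, 8, 72], s=3)
print("normal:", [str(v) for v in normal])
print("gamma: ", [str(v) for v in gamma])
```

With s = 3 only the first four moments are needed, in agreement with the count given above for stopping at $b_2$.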


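* For example, cases in which there is a minimum frequency or antimode at $x = -a$, and $dy/dx$ infinite at
one or two values for which $y = 0$, as in the frequency distributions discussed in 'Phil. Trans.,' A, vol. 186,
pp. 364-5, and 'Roy. Soc. Proc.,' vol. 62, p. 287, "Cloudiness, a Novel Case of Frequency."

There is no difficulty whatever in finding the b's; we have the system of equations,
where $\mu'_0 = 1$:

$$\begin{aligned}
a\mu'_0 + 0\cdot b_0 + \mu'_0 b_1 + 2\mu'_1 b_2 + 3\mu'_2 b_3 + \ldots &= -\mu'_1,\\
a\mu'_1 + \mu'_0 b_0 + 2\mu'_1 b_1 + 3\mu'_2 b_2 + 4\mu'_3 b_3 + \ldots &= -\mu'_2,\\
a\mu'_2 + 2\mu'_1 b_0 + 3\mu'_2 b_1 + 4\mu'_3 b_2 + 5\mu'_4 b_3 + \ldots &= -\mu'_3,\\
a\mu'_3 + 3\mu'_2 b_0 + 4\mu'_3 b_1 + 5\mu'_4 b_2 + 6\mu'_5 b_3 + \ldots &= -\mu'_4,\\
\ldots\ldots
\end{aligned} \qquad \text{(vi.)}.$$

Hence, $a, b_0, b_1, b_2, b_3, \ldots$ are at once given in terms of the determinant $\Delta$ and
its minors, where:

$$\Delta = \begin{vmatrix}
1 & 0 & 1 & 2\mu'_1 & 3\mu'_2 & 4\mu'_3 & \ldots\\
\mu'_1 & 1 & 2\mu'_1 & 3\mu'_2 & 4\mu'_3 & 5\mu'_4 & \ldots\\
\mu'_2 & 2\mu'_1 & 3\mu'_2 & 4\mu'_3 & 5\mu'_4 & 6\mu'_5 & \ldots\\
\mu'_3 & 3\mu'_2 & 4\mu'_3 & 5\mu'_4 & 6\mu'_5 & 7\mu'_6 & \ldots\\
\mu'_4 & 4\mu'_3 & 5\mu'_4 & 6\mu'_5 & 7\mu'_6 & 8\mu'_7 & \ldots\\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots
\end{vmatrix} \qquad \text{(vii.)}.$$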


The results may be simplified slightly by taking the origin at the mean, and the
moments about the mean, indicating this by dropping the dashes and putting $\mu'_1 = 0$.

Thus we have the following series of frequency curves, the origin being the
mean:

(i.) Keeping $b_0$ only

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x}{\mu_2} \qquad \text{(viii.)}.$$

This is the Laplace-Gaussian normal form.

(ii.) Keeping $b_0$, $b_1$ only

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \mu_3/2\mu_2}{\mu_2 + \mu_3 x/2\mu_2} \qquad \text{(ix.)}.$$

This is the Type III. curve of my memoir on skew variation ('Phil. Trans.,' A, vol. 186, p. 373).

(iii.) Keeping $b_0$, $b_1$, $b_2$ only

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{D}}{\dfrac{\mu_2(4\mu_2\mu_4 - 3\mu_3^2)}{D} + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{D}\,x + \dfrac{2\mu_2\mu_4 - 3\mu_3^2 - 6\mu_2^3}{D}\,x^2} \qquad \text{(x.)},$$

where, for brevity, $D = 10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2$.

This equation gave Types I.-VI. of my two memoirs on skew variation,* and
provides at once the expressions

$$a = \text{distance from mode to mean} = \frac{\sigma\sqrt{\beta_1}\,(\beta_2 + 3)}{2\,(5\beta_2 - 6\beta_1 - 9)} \qquad \text{(xi.)},$$

where $\sigma = \sqrt{\mu_2}$, $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, given in my memoir on the theory of errors
of observation without proof.†
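As a numerical check on the distance from mode to mean, the sketch below assumes that expression (xi.) has the form $\sigma\sqrt{\beta_1}(\beta_2+3)/\{2(5\beta_2-6\beta_1-9)\}$ with $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$, and applies it to an assumed example, a gamma curve with shape 4 and unit scale, whose mean is 4 and mode 3. The function name is hypothetical.

```python
from math import sqrt


def mode_mean_distance(mu2, mu3, mu4):
    """Distance from mode to mean from the moments about the mean.

    Note: sqrt(beta1) discards the sign of mu3, so the result is a magnitude.
    """
    sigma = sqrt(mu2)
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return sigma * sqrt(beta1) * (beta2 + 3) / (2 * (5 * beta2 - 6 * beta1 - 9))


# Assumed example: gamma curve with shape 4, unit scale.
# Moments about the mean: mu_2 = 4, mu_3 = 8, mu_4 = 72; mean 4, mode 3.
distance = mode_mean_distance(4.0, 8.0, 72.0)
print(distance)
```

For this example the formula returns the known separation of one unit between mean and mode.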

There is no theoretical limit, however, to this process; we can from (vi.) and (vii.)
express the a and b's at once in terms of determinants, and expanding obtain forms
which, like the formulae of Thiele, will fit closer and closer to the observed
distribution of frequency, the more moments we take. But there are three fundamental
practical objections to this. These are the following:

(a.) Experience shows that the form (x.) suffices for certainly the great bulk of
frequency distributions, i.e., it describes them effectively within the limits of random
sampling.

If the distribution be even approximately normal, the series in the denominator
converges very rapidly, for the coefficients of every power of x vanish for moments
obeying the relationships:

$$\mu_{2s+1} = 0, \qquad \mu_{2s} = (2s - 1)\,\mu_2\,\mu_{2s-2},$$

which hold for a normal series.

(b.) The labour of arithmetic and of analysis becomes very great, if we desire to
keep higher moments. If we go to $b_4$ we should have to calculate the first eight
moments of the observations about their centroid, a by no means easy task. Further,
the classification of the resulting curves and the criteria for the right one to use in a
special case, although not absolutely prohibitive, if we only go as far as $b_3$, are for
practical purposes idle in the case of taking into account $b_4$.

(c.)  The  probable  errors  of  the  higher  moments  are  so  large  that  the  values  found 
for  fi^,  /xg,  &c.,  are  quite  untrustworthy,  and  even  that  for  fi^  is  doubtful, J  unless  we 
have  frequency  series  far  larger  than  usually  occur  in  actual  observations.  This  is  a 
strong  argument  against  the  utility  of  any  descriptions  of  frequency,  such  as  those 
suggested  by  Thiele  or  Lipps,  which  depend  upon  moments  higher  than  the  fifth 
or  sixth. 

* 'Phil. Trans.,' A, vol. 186, pp. 343-414, and 'Phil. Trans.,' A, vol. 197, pp. 443-459.
† 'Phil. Trans.,' A, vol. 198, p. 277.

‡ In 'Phil. Trans.,' A, vol. 185, pp. 71-110, I have given a method of breaking up a frequency
distribution into two normal series. I obtained long ago the criterion for determining whether such a
resolution is possible or not. But it involves moments higher than the fifth, and the probable error of the
criterion is thus so great that for practical purposes it is worthless.

The question of the probable deviations of the higher moments can be illustrated as
follows, by finding the standard deviation of the moment when we take a number of
random samples from a general population. Let $\Sigma_{\mu_s}$ be the standard deviation of $\mu_s$,
then $100\,\Sigma_{\mu_s}/\mu_s$ is the percentage variability of $\mu_s$ due to random sampling. The table
below shows the increase of these percentages in the case of the moments of normal
distributions, which, quite as well as any other, will illustrate the rapid increase in
probable error as we use higher and higher moments. The general values of the
standard deviations of some of the moments were first given by Czuber,* then
far more completely by Sheppard,† and a resume of all the results recently in
'Biometrika.'‡

Percentage Variability in Moments due to Random Sampling when the Series
is supposed to be Normal.

Moment.      500 in series.      1000 in series.
$\mu_2$            6.3                 4.5
$\mu_4$           14.6                10.3
$\mu_6$           30.1                21.3
$\mu_8$           60.6                42.9
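The figures for the even moments of a normal series can be checked with a short calculation. The sketch below assumes the usual large-sample value $(\mu_{2s}-\mu_s^2)/n$ for the squared standard deviation of the $s$th moment (the further terms of the general formula vanish for even $s$ in a normal series) and the normal values $\mu_{2k} = 1\cdot3\cdot5\ldots(2k-1)\,\sigma^{2k}$; the function names are illustrative only. It agrees with the 2nd, 4th and 6th moment rows to the figure given, and with the 8th moment row to within about 0.2.

```python
import math

def normal_moment(k):
    """k-th central moment of a unit normal series: 0 for odd k, 1*3*...*(k-1) for even k."""
    if k % 2:
        return 0.0
    return float(math.prod(range(1, k, 2)))

def pct_variability(s, n):
    """100 * (s.d. of the s-th moment) / (s-th moment) for even s and a series of n."""
    variance = (normal_moment(2 * s) - normal_moment(s) ** 2) / n
    return 100.0 * math.sqrt(variance) / normal_moment(s)

for s in (2, 4, 6, 8):
    print(s, round(pct_variability(s, 500), 1), round(pct_variability(s, 1000), 1))
```

Because the unit of measurement cancels in the ratio, taking $\sigma = 1$ loses no generality.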

Precisely the same rapid increase takes place when we find the variabilities of the
ratios $\mu_4/\mu_2^2$, $\mu_6/\mu_2^3$, $\mu_8/\mu_2^4$, &c., which are the forms in which the moments actually
occur in our coefficients. In this case we have to remember that errors in the
moments are correlated, but the correlations are given in the papers cited above.§ I
find in this case the following series, which is almost as suggestive as the previous
table.

Percentage Variabilities in Ratio of Moments due to Random Sampling, the
Series being Normal.

Ratio.               500 in series.      1000 in series.
$\mu_4/\mu_2^2$            7.3                 5.2
$\mu_6/\mu_2^3$           23.3                16.5
$\mu_8/\mu_2^4$           55.1                39.0
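The variabilities of the ratios may be sketched in the same way, allowing for the correlation between errors in the moments. The sketch assumes the large-sample product value $(\mu_{s+2}-\mu_s\mu_2)/n$ for the correlated errors of the $s$th and 2nd moments of a normal series, and combines the relative errors to the first order; the function names are illustrative only. It reproduces the first two rows to the figure given and the last row to within about 0.2.

```python
import math

def normal_moment(k):
    """k-th central moment of a unit normal series (0 for odd k)."""
    if k % 2:
        return 0.0
    return float(math.prod(range(1, k, 2)))

def pct_variability_ratio(s, n):
    """Percentage variability of mu_s / mu_2^(s/2), s even, for a series of n."""
    m = normal_moment
    h = s // 2
    rel_var = ((m(2 * s) - m(s) ** 2) / m(s) ** 2            # error in mu_s
               + h ** 2 * (m(4) - m(2) ** 2) / m(2) ** 2      # error in mu_2^(s/2)
               - 2 * h * (m(s + 2) - m(s) * m(2)) / (m(s) * m(2))) / n  # correlation
    return 100.0 * math.sqrt(rel_var)

for s in (4, 6, 8):
    print(s, round(pct_variability_ratio(s, 500), 1), round(pct_variability_ratio(s, 1000), 1))
```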

The order of this increase of percentage variability, and therefore of probable error,
is the same for skew as for normal variation, and it seems therefore, with the length
of the series in customary use, idle to use the 7th or 8th moments; these have
variabilities varying from 30 to 60 per cent. of their values, and accordingly we might
easily on a random sample reach a 7th or 8th moment having half, or double the value
it actually has in the general population. Constants based on these high moments
will be practically idle. They may enable us to describe closely an individual random
sample, but no safe argument can be drawn from this individual sample as to the
general population at large, at any rate so far as the argument is based on the constants
depending upon these high moments.

* 'Theorie der Beobachtungsfehler,' S. 130, et seq.
† 'Phil. Trans.,' A, vol. 192, pp. 122, et seq.
‡ Vol. II., pp. 273-281.
§ Ibid., p. 277.

It  seems  to  me  accordingly  obvious  that,  bearing  in  mind  the  object  of  a  theory  of 
frequency  (i.e.,  the  description  of  the  distribution  in  the  general  population  by  aid  of 
a  graduated  sample,  agreeing  with  the  general  population  within  the  probable  errors 
of  random  sampling),  we  can  dismiss  from  practical  use  all  theories  which  call  upon 
us to use moments as high as the seventh or eighth. Any use of the general form
(ii.) beyond $b_3$, indirectly or directly, involves such higher moments. Personally I am
inclined to doubt whether the continental series using higher moments are, from the
standpoint of graduation, nearly as good as my form (ii.).

Hence we seem driven to the skew curves embraced in (x.) as a practical frequency
series. If we have a frequency not described by (x.) we may, perhaps, use $\mu_5$ and
$\mu_6$, but it is difficult to see how its description can possibly be bettered by the use of
still higher moments. This may seem a counsel of despair; but it is very far from
being so in reality when we remember that (x.) has proved its efficiency now (I might
almost say, without exception) in a wide range of economic, physical, biometric, and
actuarial data.

In this memoir on skew correlation I shall accordingly confine my attention, for the
most part, to constants the discovery of which does not involve the use of moments
or products of higher than six dimensions, judging all above this limit to be, as a rule,
disqualified for practical service by the magnitude of their probable errors.

(2.)  Generalised  Idea  of  Correlation. 

Given any two variables or characters A and B, we say that they are correlated
when, with different values $x$ of A, we do not find the same value $y$ of B equally likely
to be associated. In other words, certain values of B are relatively more likely to
occur with the value $x$ than others. The distribution of B's associated with a given
value of A is termed an $x$-array of B's. If $N$ pairs of A and B are taken, and $n_x$ of
these have the character A $= x$, these form the $x$-array of B's. This array, like any
other frequency distribution, will have its mean, which we will denote by $\bar{y}_x$, and its
standard deviation, which we will denote by $\sigma_{n_x}$. The mean of all the B characters
shall be $\bar{y}$ and their variability given by the standard deviation $\sigma_y$. Similarly $\bar{x}$, $\sigma_x$
will denote the mean and standard deviation of the A's, and $n_y$, $\bar{x}_y$, and $\sigma_{n_y}$ the
number of individuals, the mean and the standard deviation for a $y$-array of A's.

* Referring to equation (ii.), I propose to call curves which stop at $b_q$ skew curves of the $q$th order.
Thus the normal curve is a skew curve of zero order; curve of Type III. is a skew curve of the 1st order;
Types I., II., V., and VI. are of the 2nd order. I hope shortly to publish a discussion of skew curves of the
3rd order to complete the practically legitimate range of such curves.

Now clearly a knowledge of $\bar{y}_x$ and $\sigma_{n_x}$ will not fix the B's which will be found
associated with a given A, but it will define the limits of probable or even possible
B's. The curve obtained by plotting $\bar{y}_x$ to $x$ is termed the regression curve of $y$ on $x$.
A curve in which the ratio of $\sigma_{n_x}$ to the standard deviation $\sigma_y$ is plotted to $x$ may be
termed a scedastic* curve. Since the standard deviation is always a positive
quantity, this curve always lies on one side of the axis; it is a horizontal line in the
case of normal correlation (i.e., the Gauss-Laplacian distribution of deviations) and
coincides with the axis, in any case where correlation passes into causation, i.e., when
one value of B only is associated with each A.

The mean ordinate of this curve would clearly be a sort of general measure of the
degree of correlation between A and B, but it seems for many reasons better to base
our measure on the mean square of the weighted standard deviations of the arrays, or

$$\sigma_a^2 = S(n_x\sigma_{n_x}^2)/N \qquad \text{(xiii.)}.$$

$\sigma_a$ will thus measure the average variability in B to be found associated with any A;
its vanishing will mean that the scedastic curve as defined above will coincide with
the axis. Now let a new quantity $\eta$, defined by

$$\sigma_a^2 = (1-\eta^2)\,\sigma_y^2 \qquad \text{(xiv.)},$$

be introduced. Then clearly $\eta$ must lie between $\pm 1$, because $\sigma_a^2$ cannot be negative,
being the sum of a number of positive squares. I term $\eta$ the correlation ratio, to
distinguish it from the correlation coefficient represented by $r$. When $\eta = \pm 1$ the
correlation is perfect or we have causation. Further we have by a well-known
property of moments, if

$$\sigma_M^2 = S\{n_x(\bar{y}_x-\bar{y})^2\}/N \qquad \text{(xv.)},$$

$$\sigma_y^2 = \sigma_a^2 + \sigma_M^2,$$

or

$$\eta = \sigma_M/\sigma_y \qquad \text{(xvi.)}.$$

This shows us that the correlation ratio is the ratio of the variability of the means
of the $x$-arrays to the variability of B's in general. If $\eta = 0$, it follows that $\sigma_M$ is
zero, or from (xv.) that every $\bar{y}_x = \bar{y}$, i.e., there is no association of B's with special
A's at all, or correlation is zero. Thus the correlation ratio $\eta$, as defined by either
(xiv.) or (xvi.), is an excellent measure of the stringency of correlation, always lying
numerically between the values 0 and 1, which mark absolute independence and
complete causation respectively. Further, remembering the definition of $r$, the
coefficient of correlation, i.e.,

$$N\sigma_x\sigma_y r = S\{n_{xy}(x-\bar{x})(y-\bar{y})\} = S\{n_x(x-\bar{x})(\bar{y}_x-\bar{y})\} \qquad \text{(xvii.)},$$

we have, from (xv.) and (xvii.),

$$N(\eta^2-r^2)\,\sigma_y^2 = S\{n_x(\bar{y}_x-\bar{y})^2\} - \frac{r\sigma_y}{\sigma_x}\,S\{n_x(x-\bar{x})(\bar{y}_x-\bar{y})\}.$$

* I.e., a curve which measures the "scatter" in the arrays.

Now let

$$Y = \bar{y} + \frac{r\sigma_y}{\sigma_x}(x-\bar{x}) \qquad \text{(xviii.)},$$

then (xviii.), as is well known, gives the best fitting straight line to the series of
points $\bar{y}_x$ loaded with their respective $n_x$. We can now write

$$N(\eta^2-r^2)\,\sigma_y^2 = S\{n_x(\bar{y}_x-Y)^2\} + S\{n_x(Y-\bar{y})(\bar{y}_x-Y)\}.$$

But, using (xviii.),

$$S\{n_x(Y-\bar{y})(\bar{y}_x-Y)\} = \frac{r\sigma_y}{\sigma_x}\,S\left[n_x(x-\bar{x})\left\{\bar{y}_x-\bar{y}-\frac{r\sigma_y}{\sigma_x}(x-\bar{x})\right\}\right] = 0.$$

Thus the last summation vanishes, and we have

$$N(\eta^2-r^2)\,\sigma_y^2 = S\{n_x(\bar{y}_x-Y)^2\} \qquad \text{(xix.)}.$$

The right-hand side must always be positive, unless $\bar{y}_x = Y$, when it is zero. Hence
we conclude that $\eta$ is always greater than $r$, or the correlation ratio greater than the
correlation coefficient, except in the special case when the means of the $x$-arrays of $y$'s
all fall on a straight line, i.e., we have linear regression, and then the two correlation
constants are equal.

Thus the expression $(\eta^2-r^2)\,\sigma_y^2$ has an important physical meaning; it is the mean
square deviation of the regression curve from the straight line which fits this curve
most closely.* We have now freed our treatment of correlation from any condition
as to linearity of the regression, and it remains to consider the probable errors of the
various quantities dealt with.
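The relation between the correlation ratio and the correlation coefficient can be illustrated numerically. The following is only a sketch: the small correlation table is invented for the purpose, and the function name is hypothetical. It computes $\eta$ from the means of the $x$-arrays, $r$ from the product moment, and the weighted sum of squared deviations of the array means from the best fitting straight line, which should equal $N(\eta^2-r^2)\sigma_y^2$.

```python
import math

# Hypothetical correlation table: table[i][j] = frequency with x = xs[i], y = ys[j].
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0]
table = [
    [6, 3, 1, 0, 0],
    [1, 4, 4, 1, 0],
    [0, 1, 4, 4, 1],
    [0, 2, 5, 2, 1],
]

def correlation_summary(xs, ys, table):
    N = sum(map(sum, table))
    n_x = [sum(row) for row in table]                      # array totals
    x_bar = sum(n * x for n, x in zip(n_x, xs)) / N
    y_bar = sum(f * y for row in table for f, y in zip(row, ys)) / N
    var_x = sum(n * (x - x_bar) ** 2 for n, x in zip(n_x, xs)) / N
    var_y = sum(f * (y - y_bar) ** 2 for row in table for f, y in zip(row, ys)) / N
    # Means of the x-arrays of y's.
    y_x = [sum(f * y for f, y in zip(row, ys)) / n for row, n in zip(table, n_x)]
    var_M = sum(n * (m - y_bar) ** 2 for n, m in zip(n_x, y_x)) / N
    eta = math.sqrt(var_M / var_y)                         # correlation ratio
    cov = sum(n * (x - x_bar) * (m - y_bar) for n, x, m in zip(n_x, xs, y_x)) / N
    r = cov / math.sqrt(var_x * var_y)                     # correlation coefficient
    slope = cov / var_x                                    # equals r * sigma_y / sigma_x
    # Weighted squared deviations of the array means from the straight line Y.
    resid = sum(n * (m - (y_bar + slope * (x - x_bar))) ** 2
                for n, x, m in zip(n_x, xs, y_x))
    return N, eta, r, var_y, resid

N, eta, r, var_y, resid = correlation_summary(xs, ys, table)
print(eta, r, N * (eta ** 2 - r ** 2) * var_y, resid)
```

In this invented table the mean of the last array falls below that of the third, so the regression is not linear and $\eta$ exceeds $r$.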

(3.)  Probable  Errors  of  Constants  of  Correlation. 

We shall first prove a number of general propositions relating to the probable
errors of correlation constants. We first note that if $n$ and $n'$ be the frequencies in
any two sub-groups of a total $N$, for which no member of $n$ is a member of $n'$, then
the standard deviation of $n$ due to random sampling is given by

$$\Sigma_n = \sqrt{n\left(1-\frac{n}{N}\right)} \qquad \text{(xx.)},$$

and the correlation between deviations in $n$ and $n'$ due to random sampling is given
by

$$\Sigma_n\Sigma_{n'}R_{nn'} = -\frac{nn'}{N} \qquad \text{(xxi.)}.$$

* The properties of the correlation ratio were briefly noted in a footnote to a paper by the author in
'Roy. Soc. Proc.,' vol. 71, pp. 303-4. It has been systematically used in my laboratory for some years
and determined alongside $r$ for many distributions.
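The two sampling results, $\Sigma_n^2 = n(1-n/N)$ and $\Sigma_n\Sigma_{n'}R_{nn'} = -nn'/N$, may be checked exactly for a small case. The sketch assumes that a random sample of $N$ individuals is distributed among mutually exclusive sub-groups with fixed chances (a multinomial sample), $n$ and $n'$ being the frequencies of the general population reduced to the total $N$; the function name and the numbers are illustrative only.

```python
from itertools import product
from math import factorial

def multinomial_moments(N, probs):
    """Exact variance of the first sub-group frequency, and product moment of the
    first two, found by enumerating every possible sample of size N."""
    mean = [N * p for p in probs]
    var0 = cov01 = 0.0
    for counts in product(range(N + 1), repeat=len(probs)):
        if sum(counts) != N:
            continue
        coef = factorial(N)
        weight = 1.0
        for c, p in zip(counts, probs):
            coef //= factorial(c)      # multinomial coefficient, built up exactly
            weight *= p ** c
        pr = coef * weight             # chance of this particular sample
        var0 += pr * (counts[0] - mean[0]) ** 2
        cov01 += pr * (counts[0] - mean[0]) * (counts[1] - mean[1])
    return var0, cov01

N = 6
probs = [0.5, 0.3, 0.2]
n, n_dash = N * probs[0], N * probs[1]
var0, cov01 = multinomial_moments(N, probs)
print(var0, n * (1 - n / N))       # squared standard deviation of n
print(cov01, -n * n_dash / N)      # product moment of deviations in n and n'
```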


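Problem I. To find the correlation in deviations due to random sampling between
the number $n_{x_p}$ in the $x_p$-array of $y$'s and the number $n_{y_s}$ in the $y_s$-array of $x$'s.

If the symbol $\delta n$ denote the error or deviation in $n$, we have with an obvious
subscript notation*

$$\delta n_{x_p} = \delta n_{x_p y_1} + \delta n_{x_p y_2} + \ldots + \delta n_{x_p y_q},$$

if there be $q$ groups of $y$'s, and again

$$\delta n_{y_s} = \delta n_{x_1 y_s} + \delta n_{x_2 y_s} + \ldots + \delta n_{x_i y_s},$$

if there be $i$ groups of $x$'s.

Multiply the expressions for $\delta n_{x_p}$ and $\delta n_{y_s}$ together and we have

$$\delta n_{x_p}\,\delta n_{y_s} = (\delta n_{x_p y_s})^2 + S(\delta n_{x_p y_u}\,\delta n_{x_v y_s}),$$

where the summation is for every pair of values of $u$ and $v$, differing from $s$ and $p$.

Summing all such pairs of values for every random sample and dividing by the
number of samples taken, we have the usual definition of correlation

$$\Sigma_{n_{x_p}}\Sigma_{n_{y_s}}R_{n_{x_p}n_{y_s}} = n_{x_p y_s}\left(1-\frac{n_{x_p y_s}}{N}\right) - \frac{S(n_{x_p y_u}\,n_{x_v y_s})}{N},$$

or,

$$\Sigma_{n_{x_p}}\Sigma_{n_{y_s}}R_{n_{x_p}n_{y_s}} = n_{x_p y_s} - \frac{n_{x_p}n_{y_s}}{N} \qquad \text{(xxii.)}.$$

This gives the required correlation, since $\Sigma_{n_{x_p}}$ and $\Sigma_{n_{y_s}}$ are known from (xx.).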

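Problem II. To find the correlation between deviations in the total $n_{x_p}$ of any array
and in any sub-group $n_{x_p y_s}$ of this array.
We have at once

$$\delta n_{x_p}\,\delta n_{x_p y_s} = (\delta n_{x_p y_s})^2 + S(\delta n_{x_p y_s}\,\delta n_{x_p y_u}),$$

where $u$ is to be taken every value other than $s$ in the summation term. Summing
for all random samples and dividing by their number, we have, after using results
like (xx.) and (xxi.),

$$\Sigma_{n_{x_p}}\Sigma_{n_{x_p y_s}}R_{n_{x_p}n_{x_p y_s}} = n_{x_p y_s}\left(1-\frac{n_{x_p}}{N}\right) \qquad \text{(xxiii.)},$$

which gives

$$R_{n_{x_p}n_{x_p y_s}} = \sqrt{\frac{n_{x_p y_s}\,(N-n_{x_p})}{n_{x_p}\,(N-n_{x_p y_s})}}.$$

* $n_{xy}$ = frequency of groups with characters $x$ and $y$.

Proposition III. There is no correlation between deviations in the mean of an
$x$-array $\bar{y}_{x_p}$ and the total number in that array.

Since $n_{x_p}\bar{y}_{x_p} = S(n_{x_p y_u}\,y_u)$, we have

$$n_{x_p}\,\delta\bar{y}_{x_p} = S(\delta n_{x_p y_u}\,y_u) - \bar{y}_{x_p}\,\delta n_{x_p}.$$

Hence as before, using (xxiii.), &c.,

$$n_{x_p}\Sigma_{\bar{y}_{x_p}}\Sigma_{n_{x_p}}R_{\bar{y}_{x_p}n_{x_p}} = -\bar{y}_{x_p}n_{x_p}\left(1-\frac{n_{x_p}}{N}\right) + S\left\{n_{x_p y_u}\,y_u\left(1-\frac{n_{x_p}}{N}\right)\right\} = 0,$$

which proves that $R_{\bar{y}_{x_p}n_{x_p}}$ is zero.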

Proposition  IV. — There  is  no  correlation  between  deviations  in  the  mean  of  an 
x-array  and  in  the  total  number  in  any  other  array. 

Proof  as  before. 

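Proposition V. There is no correlation between deviations in the mean of one
$x$-array and in the mean of a second $x$-array.
We have

$$n_{x_p}\,\delta\bar{y}_{x_p} = S(\delta n_{x_p y_u}\,y_u) - \bar{y}_{x_p}\,\delta n_{x_p},$$

$$n_{x_{p'}}\,\delta\bar{y}_{x_{p'}} = S'(\delta n_{x_{p'} y_{u'}}\,y_{u'}) - \bar{y}_{x_{p'}}\,\delta n_{x_{p'}}.$$

Multiply these two expressions together, sum for all random samples, and divide
by the number of such samples. We find

$$n_{x_p}n_{x_{p'}}\Sigma_{\bar{y}_{x_p}}\Sigma_{\bar{y}_{x_{p'}}}R_{\bar{y}_{x_p}\bar{y}_{x_{p'}}} = -\bar{y}_{x_p}\bar{y}_{x_{p'}}\frac{n_{x_p}n_{x_{p'}}}{N} + \bar{y}_{x_{p'}}\,S(n_{x_p y_u}\,n_{x_{p'}}\,y_u)/N + \bar{y}_{x_p}\,S'(n_{x_{p'} y_{u'}}\,n_{x_p}\,y_{u'})/N - SS'(n_{x_p y_u}\,n_{x_{p'} y_{u'}}\,y_u\,y_{u'})/N$$

$$= \bar{y}_{x_p}\bar{y}_{x_{p'}}\frac{n_{x_p}n_{x_{p'}}}{N} - SS'(n_{x_p y_u}\,n_{x_{p'} y_{u'}}\,y_u\,y_{u'})/N.$$

The last term is $S(n_{x_p y_u}\,y_u)\times S'(n_{x_{p'} y_{u'}}\,y_{u'})/N = \bar{y}_{x_p}\bar{y}_{x_{p'}}\,n_{x_p}n_{x_{p'}}/N$, and thus the
right-hand side is identically zero. It thus appears that there is no correlation
between errors made in finding the means of two arrays. This result is not at once
obvious, although a very little consideration shows it must be true.

Proposition VI. To prove that the standard deviation of the mean $\bar{y}_{x_p}$ of any
$x$-array due to random sampling equals $\sigma_{n_{x_p}}/\sqrt{n_{x_p}}$.
We have

$$n_{x_p}\,\delta\bar{y}_{x_p} = S(\delta n_{x_p y_u}\,y_u) - \bar{y}_{x_p}\,\delta n_{x_p}.$$

Square, sum for all random samples, and divide by the number of such samples.
We have

$$n_{x_p}^2\,\Sigma_{\bar{y}_{x_p}}^2 = S(n_{x_p y_u}\,y_u^2) - n_{x_p}\bar{y}_{x_p}^2 = n_{x_p}\sigma_{n_{x_p}}^2.$$

Hence

$$\Sigma_{\bar{y}_{x_p}} = \sigma_{n_{x_p}}/\sqrt{n_{x_p}} \qquad \text{(xxiv.)}.$$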

Thus  the  probable  error  of  the  mean  of  an  array  has  exactly  the  same  form  as  the 
probable  error  of  the  mean  of  a  random  sample  of  a  definite  number  of  individuals. 
The  array  may  have  a  variable  number  of  individuals,  but  we  have  seen  in 
Proposition  III.  that  there  is  no  correlation  between  errors  in  its  mean  and  errors  in 
the  total  number  of  individuals  contained  in  it. 

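Problem VII. To find the probable error of the standard deviation of any array.

By a precisely similar investigation to that of the previous proposition we find

$$\Sigma_{\sigma_{n_{x_p}}} = \sqrt{\frac{m_4-m_2^2}{4m_2\,n_{x_p}}} \qquad \text{(xxv.)},$$

where

$$m_q = S\{n_{x_p y_u}(y_u-\bar{y}_{x_p})^q\}/n_{x_p}.$$

This is identical with the probable error we should have if the array were a random
sample of constant size.

In many cases it will be sufficiently approximate to put $m_4 = 3m_2^2$ and we then
have

$$0.67449\,\Sigma_{\sigma_{n_{x_p}}} = 0.67449\,\frac{\sigma_{n_{x_p}}}{\sqrt{2n_{x_p}}} \qquad \text{(xxvi.)},$$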
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the  well-known  form  for  the  probable  error  of  the  standard  deviation  of  a  normal 
distribution  of  a  definite  number  of  individuals. 

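Problem VIII. To find the standard deviation of the standard deviation $\sigma_M$ of the
means of the arrays due to random sampling.

Since

$$N\sigma_M^2 = S\{n_{x_p}(\bar{y}_{x_p}-\bar{y})^2\},$$

$$2N\sigma_M\,\delta\sigma_M = S\{\delta n_{x_p}(\bar{y}_{x_p}-\bar{y})^2\} + 2S\{\delta\bar{y}_{x_p}\,n_{x_p}(\bar{y}_{x_p}-\bar{y})\} - 2\,\delta\bar{y}\,S\{n_{x_p}(\bar{y}_{x_p}-\bar{y})\},$$

the last term of which vanishes, since

$$N\bar{y} = S(n_{x_p}\bar{y}_{x_p}).$$

Square the above relation, sum for all random samples, and divide by the number
of such samples.
We find

$$4N^2\sigma_M^2\,\Sigma_{\sigma_M}^2 = S\left\{n_{x_p}\left(1-\frac{n_{x_p}}{N}\right)(\bar{y}_{x_p}-\bar{y})^4\right\}$$
$$- 2S\{n_{x_p}n_{x_{p'}}(\bar{y}_{x_p}-\bar{y})^2(\bar{y}_{x_{p'}}-\bar{y})^2\}/N$$
$$+ 4S\{\Sigma_{n_{x_p}}\Sigma_{\bar{y}_{x_{p'}}}R_{n_{x_p}\bar{y}_{x_{p'}}}\,n_{x_{p'}}(\bar{y}_{x_p}-\bar{y})^2(\bar{y}_{x_{p'}}-\bar{y})\}$$
$$+ 8S\{\Sigma_{\bar{y}_{x_p}}\Sigma_{\bar{y}_{x_{p'}}}R_{\bar{y}_{x_p}\bar{y}_{x_{p'}}}\,n_{x_p}n_{x_{p'}}(\bar{y}_{x_p}-\bar{y})(\bar{y}_{x_{p'}}-\bar{y})\}$$
$$+ 4S\{\Sigma_{\bar{y}_{x_p}}^2\,n_{x_p}^2(\bar{y}_{x_p}-\bar{y})^2\},$$

where in the second and fourth sums each pair of different arrays is taken once, and
in the third sum $p'$ takes every value, including $p$.

But $R_{n_{x_p}\bar{y}_{x_p}}$, $R_{n_{x_p}\bar{y}_{x_{p'}}}$, and $R_{\bar{y}_{x_p}\bar{y}_{x_{p'}}}$ vanish by Propositions III., IV., and V. Further, by
VI., $\Sigma_{\bar{y}_{x_p}}^2 = \sigma_{n_{x_p}}^2/n_{x_p}$. Hence we have

$$4N^2\sigma_M^2\,\Sigma_{\sigma_M}^2 = S\left\{n_{x_p}\left(1-\frac{n_{x_p}}{N}\right)(\bar{y}_{x_p}-\bar{y})^4\right\} - 2S\{n_{x_p}n_{x_{p'}}(\bar{y}_{x_p}-\bar{y})^2(\bar{y}_{x_{p'}}-\bar{y})^2\}/N + 4S\{n_{x_p}\sigma_{n_{x_p}}^2(\bar{y}_{x_p}-\bar{y})^2\}.$$

Now let

$$N\lambda_q = S\{n_{x_p}(\bar{y}_{x_p}-\bar{y})^q\}$$

be the $q$th moment of the means of the arrays about their mean. Then clearly
$\lambda_2 = \sigma_M^2$. Further, since $S(n_{x_p}\sigma_{n_{x_p}}^2) = N\sigma_y^2(1-\eta^2)$, we can write

$$S\{n_{x_p}\sigma_{n_{x_p}}^2(\bar{y}_{x_p}-\bar{y})^2\} = N\sigma_y^2(1-\eta^2)\,\sigma_M^2\,\chi_1,$$

where $\chi_1$ is a purely numerical constant, which is equal to unity for those cases in
which there is no correlation between the standard deviation of an array and the
square of its mean's deviation from the mean. Thus finally we find

$$\Sigma_{\sigma_M}^2 = \frac{\lambda_4-\lambda_2^2}{4N\lambda_2} + \frac{\sigma_y^2(1-\eta^2)\,\chi_1}{N} \qquad \text{(xxvii.)}.$$

This enables us at once to find the probable error of the standard deviation of the
means of the arrays.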

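Proposition IX. To find the correlation between the deviations due to random
sampling in the values of $\sigma_y$ and $\sigma_M$.
We have

$$N\sigma_y^2 = S\{n_{y_s}(y_s-\bar{y})^2\},$$

$$2N\sigma_y\,\delta\sigma_y = S\{\delta n_{y_s}(y_s-\bar{y})^2\} - 2\,\delta\bar{y}\,S\{n_{y_s}(y_s-\bar{y})\};$$

the last term vanishes because $S(n_{y_s}y_s) = N\bar{y}$.
Thus

$$2N\sigma_y\,\delta\sigma_y = S\{\delta n_{y_s}(y_s-\bar{y})^2\}.$$

But from the previous proposition

$$2N\sigma_M\,\delta\sigma_M = S\{\delta n_{x_p}(\bar{y}_{x_p}-\bar{y})^2\} + 2S\{\delta\bar{y}_{x_p}\,n_{x_p}(\bar{y}_{x_p}-\bar{y})\}.$$

Multiply these two expressions together, sum for all random samples and divide by
the number of such samples; we find

$$4N^2\sigma_y\sigma_M\,\Sigma_{\sigma_y}\Sigma_{\sigma_M}R_{\sigma_y\sigma_M} = S\{\Sigma_{n_{y_s}}\Sigma_{n_{x_p}}R_{n_{y_s}n_{x_p}}(y_s-\bar{y})^2(\bar{y}_{x_p}-\bar{y})^2\} + 2S\{\Sigma_{n_{y_s}}\Sigma_{\bar{y}_{x_p}}R_{n_{y_s}\bar{y}_{x_p}}\,n_{x_p}(y_s-\bar{y})^2(\bar{y}_{x_p}-\bar{y})\}.$$

To evaluate this, we require to find the two correlations expressed by $R_{n_{y_s}n_{x_p}}$ and
$R_{n_{y_s}\bar{y}_{x_p}}$. We will consider the two summation terms separately.

First Term.

$$\delta n_{x_p} = \delta n_{x_p y_1} + \delta n_{x_p y_2} + \ldots + \delta n_{x_p y_s} + \ldots,$$

$$\delta n_{y_s} = \delta n_{x_1 y_s} + \delta n_{x_2 y_s} + \ldots + \delta n_{x_p y_s} + \ldots,$$

$$\delta n_{x_p}\,\delta n_{y_s} = (\delta n_{x_p y_s})^2 + S(\delta n_{x_p y_{s'}}\,\delta n_{x_{p'} y_s}),$$

where in the summation $p'$ and $s'$ are not equal to $p$ and $s$.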
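Proceeding in the usual manner we find

$$\Sigma_{n_{x_p}}\Sigma_{n_{y_s}}R_{n_{x_p}n_{y_s}} = n_{x_p y_s} - S(n_{x_p y_{s'}})\times S(n_{x_{p'} y_s})/N,$$

where in the first sum $s'$ is to take all possible values, and in the second $p'$ is to take
all possible values. Thus we have

$$\Sigma_{n_{x_p}}\Sigma_{n_{y_s}}R_{n_{x_p}n_{y_s}} = n_{x_p y_s} - \frac{n_{x_p}n_{y_s}}{N} \qquad \text{(xxviii.)}.$$

Substituting we find

$$\text{First Term} = S_1\{n_{x_p y_s}(y_s-\bar{y})^2(\bar{y}_{x_p}-\bar{y})^2\} - S_2\{n_{x_p}n_{y_s}(y_s-\bar{y})^2(\bar{y}_{x_p}-\bar{y})^2\}/N.$$

Here both the summations are really double summations; fixing our attention on
any $x_p$, i.e., on any array of $y$'s for a given value of $x$, we have first to sum for all $y$'s
in this array, and then we have to sum for all arrays. This is the meaning of $S_1$. In
$S_2$ we are to associate every array of $x$'s with every array of $y$'s; hence this term will
break up at once into two factors, i.e.,

$$S\{n_{y_s}(y_s-\bar{y})^2\}\times S\{n_{x_p}(\bar{y}_{x_p}-\bar{y})^2\}/N = N\sigma_y^2\sigma_M^2.$$

Keeping $x_p$ constant first in $S_1$, we see that

$$S\{n_{x_p y_s}(y_s-\bar{y})^2\} = n_{x_p}\{\sigma_{n_{x_p}}^2 + (\bar{y}_{x_p}-\bar{y})^2\}$$

is the 2nd moment of the $y$'s in the $x_p$ array about the mean of the system.

Combining we have

$$\text{First Term} = S\{n_{x_p}(\bar{y}_{x_p}-\bar{y})^4\} + S\{n_{x_p}\sigma_{n_{x_p}}^2(\bar{y}_{x_p}-\bar{y})^2\} - N\sigma_y^2\sigma_M^2$$

$$= N\{\lambda_4 + \sigma_y^2\sigma_M^2(1-\eta^2)\chi_1 - \sigma_y^2\sigma_M^2\} \qquad \text{(xxix.)}.$$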

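We now turn to the second term which involves the discovery of $R_{n_{y_s}\bar{y}_{x_p}}$.

$$\delta n_{y_s}\,\delta\bar{y}_{x_p} = (\delta n_{x_1 y_s} + \delta n_{x_2 y_s} + \ldots + \delta n_{x_p y_s} + \ldots)\,\delta\bar{y}_{x_p},$$

$$n_{x_p}\,\delta\bar{y}_{x_p} = -\bar{y}_{x_p}\,\delta n_{x_p} + S(\delta n_{x_p y_u}\,y_u).$$

Hence

$$n_{x_p}\,\delta n_{y_s}\,\delta\bar{y}_{x_p} = -\bar{y}_{x_p}(\delta n_{x_1 y_s} + \delta n_{x_2 y_s} + \ldots + \delta n_{x_p y_s} + \ldots)\,\delta n_{x_p} + (\delta n_{x_1 y_s} + \delta n_{x_2 y_s} + \ldots + \delta n_{x_p y_s} + \ldots)\,S(\delta n_{x_p y_u}\,y_u).$$

Sum for all random samples and divide by the number of such samples; we have

$$n_{x_p}\Sigma_{n_{y_s}}\Sigma_{\bar{y}_{x_p}}R_{n_{y_s}\bar{y}_{x_p}} = -\bar{y}_{x_p}\left(n_{x_p y_s} - \frac{n_{x_p}n_{y_s}}{N}\right) + \left(n_{x_p y_s}\,y_s - \frac{n_{y_s}n_{x_p}\bar{y}_{x_p}}{N}\right)$$

$$= n_{x_p y_s}(y_s-\bar{y}_{x_p}) \qquad \text{(xxx.)}.$$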
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Substituting  we  have 

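$$\text{Second Term} = 2S\{n_{x_p y_s}(y_s-\bar{y}_{x_p})(y_s-\bar{y})^2(\bar{y}_{x_p}-\bar{y})\}.$$

Here again the summation is of a double character.

Let us first take $x_p$ as constant and sum for every value of $y_s$. We may write
$y_s-\bar{y} = (y_s-\bar{y}_{x_p}+\bar{y}_{x_p}-\bar{y})$, and our first summation will be

$$2(\bar{y}_{x_p}-\bar{y})\times S[n_{x_p y_s}\{(y_s-\bar{y}_{x_p})^3 + 2(y_s-\bar{y}_{x_p})^2(\bar{y}_{x_p}-\bar{y}) + (y_s-\bar{y}_{x_p})(\bar{y}_{x_p}-\bar{y})^2\}]$$

$$= 2(\bar{y}_{x_p}-\bar{y})\,n_{x_p}m_3 + 4(\bar{y}_{x_p}-\bar{y})^2\,n_{x_p}m_2 + 2(\bar{y}_{x_p}-\bar{y})^3\,S\{n_{x_p y_s}(y_s-\bar{y}_{x_p})\},$$

if

$$n_{x_p}m_3 = S\{n_{x_p y_s}(y_s-\bar{y}_{x_p})^3\}.$$

The last term vanishes for $S(n_{x_p y_s}\,y_s) = n_{x_p}\bar{y}_{x_p}$ by the definition of the mean.
Hence

$$\text{Second Term} = 2S\{n_{x_p}m_3(\bar{y}_{x_p}-\bar{y})\} + 4S\{n_{x_p}\sigma_{n_{x_p}}^2(\bar{y}_{x_p}-\bar{y})^2\}.$$

Here $m_3$ is the third moment of the $x_p$ array of $y$'s, which will probably be very
small if the arrays are nearly symmetrical, and the first term clearly depends on the
existence of a correlation between the skewness of the arrays and the magnitude
of their means.

We may write the first term then:

$$= 2N\sigma_y^3(1-\eta^2)^{3/2}\,\sigma_M\,\chi_2,$$

where $\chi_2$ is a purely numerical quantity, which for most cases will probably be very
small or even zero.
Thus we find:

$$\text{Second Term} = 2N\sigma_y^3(1-\eta^2)^{3/2}\,\sigma_M\,\chi_2 + 4N\sigma_y^2\sigma_M^2(1-\eta^2)\,\chi_1 \qquad \text{(xxxi.)}.$$

We can now return to p. 16 and write down the full correlation between deviations
in the values of $\sigma_y$ and $\sigma_M$ due to random sampling. Remembering that $\sigma_M = \eta\sigma_y$,*
we find:

$$\Sigma_{\sigma_y}\Sigma_{\sigma_M}R_{\sigma_y\sigma_M} = \frac{\sigma_y\sigma_M}{4N}\left\{\frac{\lambda_4}{\sigma_y^2\sigma_M^2} + 5(1-\eta^2)\chi_1 - 1 + \frac{2}{\eta}(1-\eta^2)^{3/2}\chi_2\right\} \qquad \text{(xxxii.)}.$$

* It should be remembered that this definition of $\eta$ gives it invariably the positive sign.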
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Proposition  X. — To  find  the  standard  deviation  of  the  values  of  the  correlation 
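ratio $\eta$ due to random sampling, i.e., to find the probable error of the correlation
ratio $\eta$.

We have

$$\eta = \sigma_M/\sigma_y.$$

Hence

$$\frac{\delta\eta}{\eta} = \frac{\delta\sigma_M}{\sigma_M} - \frac{\delta\sigma_y}{\sigma_y}.$$

Squaring, summing for all random samples and dividing by the number of such
samples, we have:

$$\frac{\Sigma_\eta^2}{\eta^2} = \frac{\Sigma_{\sigma_M}^2}{\sigma_M^2} + \frac{\Sigma_{\sigma_y}^2}{\sigma_y^2} - \frac{2\,\Sigma_{\sigma_M}\Sigma_{\sigma_y}R_{\sigma_M\sigma_y}}{\sigma_M\sigma_y}.$$

$\Sigma_{\sigma_M}^2$ is given by (xxvii.), $\Sigma_{\sigma_y}\Sigma_{\sigma_M}R_{\sigma_y\sigma_M}$ by (xxxii.) and $\Sigma_{\sigma_y}^2 = (\mu_4-\mu_2^2)/(4N\mu_2)$ by a well-known
formula.*

Substituting, we have the complete value of $\Sigma_\eta$ given by:

$$\frac{\Sigma_\eta^2}{\eta^2} = \frac{\lambda_4-\lambda_2^2}{4N\lambda_2^2} + \frac{(1-\eta^2)\chi_1}{N\eta^2} + \frac{\mu_4-\mu_2^2}{4N\mu_2^2} - \frac{1}{2N}\left\{\frac{\lambda_4}{\mu_2\lambda_2} + 5(1-\eta^2)\chi_1 - 1 + \frac{2}{\eta}(1-\eta^2)^{3/2}\chi_2\right\},$$

or, after re-arranging,

$$\Sigma_\eta^2 = \frac{1}{N}\left[(1-\eta^2)^2 + \frac{\eta^2}{4}\,\frac{\mu_4-3\mu_2^2}{\mu_2^2} + \frac{\eta^2(1-2\eta^2)}{4}\,\frac{\lambda_4-3\lambda_2^2}{\lambda_2^2} + (\chi_1-1)(1-\eta^2)(1-\tfrac{5}{2}\eta^2) - \chi_2\,\eta\,(1-\eta^2)^{3/2}\right] \qquad \text{(xxxiii.)}.$$

For normal correlation, $\mu_4 = 3\mu_2^2$. Further

$$\bar{y}_x-\bar{y} = \frac{r\sigma_y}{\sigma_x}(x-\bar{x}),$$

and

$$N\lambda_4 = S\{n_x(\bar{y}_x-\bar{y})^4\} = \frac{r^4\sigma_y^4}{\sigma_x^4}\times N\,3\sigma_x^4 = 3N\lambda_2^2.$$

Hence the second and third terms vanish. Further $\chi_1 = 1$, $\chi_2 = 0$, while $\eta = r$.
Hence we have

$$\Sigma_\eta^2 = \Sigma_r^2 = \frac{(1-r^2)^2}{N},$$

which agrees with the special result.

* 'Biometrika,' vol. II., p. 276.

In any other case, $\chi_2$, $\chi_1-1$, $(\mu_4-3\mu_2^2)/\mu_2^2$, $(\lambda_4-3\lambda_2^2)/\lambda_2^2$ will probably be small and
thus

$$\text{Probable error of } \eta = 0.67449\,\frac{1-\eta^2}{\sqrt{N}} \text{ nearly} \qquad \text{(xxxiv.)}.$$

This simple form suffices for many practical cases.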
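The two forms of the probable error of $\eta$ may be sketched as follows. The simple form is $0.67449\,(1-\eta^2)/\sqrt{N}$. For the fuller form the sketch assumes that the rearranged expression (xxxiii.) reads $N\Sigma_\eta^2 = (1-\eta^2)^2 + \tfrac14\eta^2(\mu_4-3\mu_2^2)/\mu_2^2 + \tfrac14\eta^2(1-2\eta^2)(\lambda_4-3\lambda_2^2)/\lambda_2^2 + (\chi_1-1)(1-\eta^2)(1-\tfrac52\eta^2) - \chi_2\eta(1-\eta^2)^{3/2}$, which is what follows on combining (xxvii.), (xxxii.) and the standard deviation of $\sigma_y$; the function names and the numerical values are illustrative only.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard deviation

def var_eta(N, eta, mu2, mu4, lam2, lam4, chi1=1.0, chi2=0.0):
    """Squared standard deviation of eta, under the assumed reading of (xxxiii.)."""
    e2 = eta ** 2
    terms = ((1 - e2) ** 2
             + 0.25 * e2 * (mu4 - 3 * mu2 ** 2) / mu2 ** 2
             + 0.25 * e2 * (1 - 2 * e2) * (lam4 - 3 * lam2 ** 2) / lam2 ** 2
             + (chi1 - 1) * (1 - e2) * (1 - 2.5 * e2)
             - chi2 * eta * (1 - e2) ** 1.5)
    return terms / N

def probable_error_eta_simple(N, eta):
    """The simple approximate form (xxxiv.)."""
    return PE_FACTOR * (1 - eta ** 2) / math.sqrt(N)

# Normal-type values: mu4 = 3 mu2^2, lam4 = 3 lam2^2, chi1 = 1, chi2 = 0,
# for which the fuller form should reduce to the simple one.
N, eta, mu2 = 1000, 0.5, 1.0
lam2 = eta ** 2 * mu2
full = PE_FACTOR * math.sqrt(var_eta(N, eta, mu2, 3 * mu2 ** 2, lam2, 3 * lam2 ** 2))
simple = probable_error_eta_simple(N, eta)
print(full, simple)
```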

If greater exactitude is wanted, there is, however, no great labour in using
(xxxiii.). We find the means and standard deviations of each array.

Then $N\lambda_2$ and $N\lambda_4$ are the 2nd and 4th moments of the means of these arrays
about their mean.

$N\mu_2$ and $N\mu_4$ are the 2nd and 4th moments about the mean of the $y$-characters, and
will always be known for skew variation.

$\chi_1$ is defined by

$$\chi_1 = \frac{S\{n_{x_p}\sigma_{n_{x_p}}^2(\bar{y}_{x_p}-\bar{y})^2\}}{N\sigma_y^2(1-\eta^2)\,\sigma_M^2} \qquad \text{(xxxv.)},$$

and can be easily found when the means and standard deviations of each array have
been found.

The most troublesome expression is $\chi_2$ defined by

$$\chi_2 = \frac{S\{n_{x_p}m_3(\bar{y}_{x_p}-\bar{y})\}}{N\sigma_y^3(1-\eta^2)^{3/2}\,\sigma_M} \qquad \text{(xxxvi.)}.$$

But as we do not take usually more than 10 to 20 arrays, the discovery of their
3rd moments is not an extremely difficult task. As a rule, however, $\chi_2$ is very small
and may be fairly neglected, even when we must find $\chi_1-1$; these points will
be dealt with in the numerical illustrations given later in this paper. At present
we note that the probable error of $\eta$ has been determined, and that its value for the
general case is not really more complex than the value of the probable error of $r$ in
the general case, which requires the determination of product moments of the 4th
order.*

* Let $Np_{qs} = S\{n_{xy}(x-\bar{x})^q(y-\bar{y})^s\}$, then the probable error of $r$ is given by

$$0.67449\,\frac{r}{\sqrt{N}}\left\{\frac{p_{22}}{p_{11}^2} + \frac{p_{22}}{2p_{20}p_{02}} + \frac{p_{40}}{4p_{20}^2} + \frac{p_{04}}{4p_{02}^2} - \frac{p_{31}}{p_{11}p_{20}} - \frac{p_{13}}{p_{11}p_{02}}\right\}^{1/2} \qquad \text{(xxxvii.)}.$$

This agrees with the value given by Sheppard ('Phil. Trans.,' A, vol. 192, p. 128), except that the $r^2$
factor has been dropped by a printer's error in his paper. For the special case of a normal distribution, we
have easily from the equation to the normal surface

$$p_{40} = 3p_{20}^2, \quad p_{04} = 3p_{02}^2, \quad p_{31} = 3p_{11}p_{20}, \quad p_{13} = 3p_{11}p_{02}, \quad (p_{22}-3p_{11}^2)/p_{11}^2 = (1-r^2)/r^2,$$

and

$$\text{probable error of } r = 0.67449\,\frac{1-r^2}{\sqrt{N}},$$

the well-known form ('Phil. Trans.,' A, vol. 191, p. 245).
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(4.)  On  the  Higher  Types  of  Regression. 

We have already seen how the introduction of the correlation ratio $\eta$ enables us to
drop the limitations associated with the Gauss-Laplacian form of frequency, and the
Bravais correlation formulae. The fundamental step towards this advance was
undoubtedly taken by G. U. Yule in his paper in the 'Roy. Soc. Proc.,' vol. 60,
pp. 477 et seq., wherein he shows that if the regression be linear, the Bravais type of
formula applied to multiple correlation is still true, although we make no assumption
as to the form of the frequency surface. It would undoubtedly be a gain to have
skew frequency surfaces which would describe skew correlation for the great mass of
cases as effectively as the series of skew frequency curves describe skew variation, but
although a considerable amount of progress has been made in the consideration of
these surfaces, their full theory has not yet been worked out owing to difficulties
of analysis, and their complete discussion must still be postponed. Yule's method
of approaching the problem from the form of the regression curves is, however,
available and capable of very great extension. Its chief advantage is that it
makes little or no assumption as to the distribution of frequency; its chief defect
lies even in this advantage of generality: it does not enable us to predict the
probability of an individual with a given combination of characters. This follows at
once from the fact that we make no assumption as to the form of the distribution
within an array. Without some theory as to variation within the array, we are
reduced to the laborious process of calculating the standard deviation, skewness, and
other general characters of each array, a lengthy and troublesome process compared
with a theory which would, like the Bravais theory, give these at once in terms of a
few constants determined from the data as a whole.

In the great bulk of biometrical and economical enquiries, however, the regression
does not diverge very markedly from the linear form. In the cases of non-linear
regression that I have hitherto had to deal with, I find that parabolae of the 2nd
or 3rd order will suffice as a rule to describe the deviation from linearity. If
they did not, we could, of course, use curves of higher orders, but the difficulty
referred to in the first section of this paper at once arises: we then need to use
in the determination moments and product-moments of such high orders that the
probable errors of the constants are so high as to render valueless their calculation
from such statistical data as we can hope for in most actual inquiries. In the great
bulk of investigations it is practically impossible to increase our random samples
from 500 to 1,000 individuals up to 50,000 to 100,000. Nor in the great
bulk of statistical cases is any such increase even desirable, for a fairly wide
experience shows that 2nd and 3rd order parabolae amply suffice to describe the
skewness of the regression line. I shall accordingly classify skew correlation in the
following manner:

(a.) Linear Regression:

The mean of an $x$-array of $y$'s, i.e., $\bar{y}_{x_p}$, is given by

$$\bar{y}_{x_p} = a_0 + a_1x_p \qquad \text{(xxxviii.)}.$$

(b.) Parabolic* Regression:

The mean of an $x$-array of $y$'s, i.e., $\bar{y}_{x_p}$, is given by

$$\bar{y}_{x_p} = a_0 + a_1x_p + a_2x_p^2 \qquad \text{(xxxix.)}.$$

(c.) Cubical* Regression:

The mean of an $x$-array of $y$'s, i.e., $\bar{y}_{x_p}$, is given by

$$\bar{y}_{x_p} = a_0 + a_1x_p + a_2x_p^2 + a_3x_p^3 \qquad \text{(xl.)}.$$

It  is  conceivable — in  fact,  from  unpublished  work  already  done,  highly  probable — 
that  the  theory  of  skew  variation  will  give  regression  curves,  not  of  the  exact  form 
involved  in  (xxxix.)  or  (xl),  but  containing  product  terms  in  x  and  y.  The  most 
general  equation  to  a  regression  curve  may  be  taken  to  be  of  the  type 

and what experience shows us is: that for the great bulk of vital phenomena it is
sufficient to expand by Maclaurin's theorem and keep the first three or four terms.
Indeed, in the large majority of cases, (xxxviii.) alone suffices. Hence, if (xxxix.)
or (xl.) fit the data within the limits of random sampling, we are not injudiciously
circumscribing future developments of the theory of skew correlation by casting our
regression curves into the above forms. I shall deal first with the theory of cubical
regression, for we can then obtain from this the conditions necessary for parabolic
and linear regressions.
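The three types of regression can be illustrated by fitting a polynomial of the 1st, 2nd or 3rd order to the means of the arrays, each mean being loaded with the number in its array. What follows is only a sketch by ordinary weighted least squares, not the moment method developed in this memoir; the function name and the data are invented for illustration.

```python
def fit_regression(xs, y_means, n_x, order):
    """Coefficients a_0, a_1, ... of the weighted least-squares polynomial of the
    given order through the array means y_means, each loaded with n_x."""
    k = order + 1
    # Normal equations A a = b, with A[i][j] = S(n x^(i+j)) and b[i] = S(n x^i y).
    A = [[sum(n * x ** (i + j) for n, x in zip(n_x, xs)) for j in range(k)]
         for i in range(k)]
    b = [sum(n * x ** i * y for n, x, y in zip(n_x, xs, y_means)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, k):
            f = A[row][col] / A[col][col]
            for c in range(col, k):
                A[row][c] -= f * A[col][c]
            b[row] -= f * b[col]
    # Back substitution.
    a = [0.0] * k
    for i in reversed(range(k)):
        a[i] = (b[i] - sum(A[i][j] * a[j] for j in range(i + 1, k))) / A[i][i]
    return a

# Invented array means lying exactly on a parabola of the 2nd order.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
n_x = [5, 20, 50, 20, 5]
y_means = [1.0 + 0.5 * x - 0.25 * x ** 2 for x in xs]

parabolic = fit_regression(xs, y_means, n_x, order=2)
cubical = fit_regression(xs, y_means, n_x, order=3)
print(parabolic)
print(cubical)
```

When the means lie on a parabola of the 2nd order, the cubical fit returns a vanishing coefficient of $x^3$, which is the sense in which the conditions for parabolic and linear regression are contained in the cubical case.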

I must remind the reader, however, that the form of the regression line does not in
any way limit the nature of the distribution of the array about its mean; the
variability of an array, i.e., the standard deviation of an array, having for its mean
value $\sigma_y\sqrt{1-\eta^2}$, may or may not be the same for all arrays. If it is the same, or all
arrays are equally scattered about their means, I shall speak of the system as a
homoscedastic system, otherwise it is a heteroscedastic system. The Gauss-Laplacian
correlation surface gives a homoscedastic linear system. Mr. Yule's linear regression
is not necessarily homoscedastic; it may, however, be homoscedastic without being
normal, and then the scatter of each array is measured by $\sigma_y\sqrt{1-r^2}$. When a
system is homoscedastic, but not linear, then $\sigma_{n_x}^2 = \sigma_y^2(1-\eta^2)$, and consequently the
$\chi_1$ of (xxxv.) is equal to unity. $\chi_1 = 1$ is a necessary result of homoscedasticity.

Lastly,  we  want  a  word  to  express  the  idea  of  all  the  arrays  having  equal  skewness, 

* 'Parabolic' and 'cubical' are here used in the narrower sense of regression curves corresponding to
ordinary parabolae of the 2nd order and of the 3rd order respectively: in both cases the axis of the
parabola being parallel to the axis of the y-character.
or being asymmetrical in an equal degree about their means. I shall express this by
the term homoclitic; generally the arrays will not be equally asymmetrical round their
means, and in this case we shall speak of them as heteroclitic. If there were no
skewness in any of the arrays, then $m_3$ of (xxxvi.) would be zero for all of them.
I term arrays of no skewness isocurtic, and skew arrays allocurtic. If we supposed
that a curve of Type III. would sufficiently express the skewness of an array, we
should have

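$$\text{Sk.} = \tfrac{1}{2}\, m_3/\sigma_{n_x}^3,$$

and therefore from (xxxvi.)

$$\chi_2 = \frac{2\,S\{n_{x_p}\,\sigma_{n_x}^3\,(\text{Sk.})\,(y_{x_p}-\bar y)\}}{N\sigma_y^3(1-\eta^2)^{3/2}\,\sigma_M}.$$

For a homoscedastic system we have $\sigma_{n_x} = \sigma_y\sqrt{1-\eta^2}$, and therefore

$$\chi_2 = \frac{2\,S\{n_{x_p}\,(\text{Sk.})\,(y_{x_p}-\bar y)\}}{N\sigma_M},$$

and for a homoclitic system

$$\chi_2 = \frac{2\,(\text{Sk.})\,S\{n_{x_p}\,\sigma_{n_x}^3\,(y_{x_p}-\bar y)\}}{N\sigma_y^3(1-\eta^2)^{3/2}\,\sigma_M}.$$

For a homoclitic homoscedastic system, whether isocurtic or allocurtic,

$$\chi_2 = \frac{2\,(\text{Sk.})\,S\{n_{x_p}\,(y_{x_p}-\bar y)\}}{N\sigma_M} = 0.$$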

Thus $\chi_2$ is to a certain extent a measure of both homoscedasticity and homoclisy.
But as the correlation between $\sigma_{n_x}$ and $y_{x_p}-\bar y$ is in most cases extremely small, while
the skewness of the array can well change its sign with arrays above or below the
mean, we can fairly consider the smallness of $\chi_2$ to be a measure of the approach to
homoclisy. I am thus inclined to speak of $\chi_1 - 1$ and $\chi_2$ as measures of heteroscedasticity
and heteroclisy. When they both vanish we have a homoscedastic homoclitic system.
For such systems $\eta$, the correlation ratio, tells us effectively the scatter of any array,
and as a rule all we want to know, in addition, is the form of the regression line.

(5.) Cubical Regression.

We  have  already  used  the  following  notation 

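$$N p_{qq'} = S\{n_{xy}\,(x-\bar x)^q\,(y-\bar y)^{q'}\}$$   (xlii.).

We shall shorten our formulae if we write

$$r = \frac{p_{11}}{\sigma_x\sigma_y}, \quad \epsilon' = \frac{p_{21}}{\sigma_x^2\sigma_y}, \quad \zeta' = \frac{p_{31}}{\sigma_x^3\sigma_y}, \quad \theta' = \frac{p_{41}}{\sigma_x^4\sigma_y}$$   (xliii.).

We have already used $\mu_q$ to denote $p_{0q}$, and we shall use $\nu_q$ for $p_{q0}$. Further, we
write

$$\beta_1 = \nu_3^2/\nu_2^3, \quad \beta_2 = \nu_4/\nu_2^2, \quad \beta_3 = \nu_3\nu_5/\nu_2^4, \quad \beta_4 = \nu_6/\nu_2^3$$   (xliv.).

$\sqrt{\beta_1} = \nu_3/\sigma_x^3$ will be of the same sign as $\nu_3$. These constants $\beta$ have been previously
used in the theory of skew variation.*
We shall further put

$$\epsilon = \epsilon' - r\sqrt{\beta_1}, \quad \zeta = \zeta' - r\beta_2, \quad \theta = \theta' - r\beta_3/\sqrt{\beta_1}$$   (xlv.).

The regularity of the forms $\epsilon$, $\zeta$, $\theta$ is rather screened by the above notation, which
is introduced for brevity; using the $p_{qq'}$ notation, we have

$$\epsilon = \frac{p_{21}p_{20}-p_{11}p_{30}}{\sigma_x^4\sigma_y}, \quad \zeta = \frac{p_{31}p_{20}-p_{11}p_{40}}{\sigma_x^5\sigma_y}, \quad \theta = \frac{p_{41}p_{20}-p_{11}p_{50}}{\sigma_x^6\sigma_y}$$   (xlvi.),

whence the law of formation of these constants is easily seen.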
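The regression curve may now be conveniently put into the form

$$\frac{y_{x_p}-\bar y}{\sigma_y} = b_0 + b_1\frac{x_p-\bar x}{\sigma_x} + b_2\left(\frac{x_p-\bar x}{\sigma_x}\right)^2 + b_3\left(\frac{x_p-\bar x}{\sigma_x}\right)^3$$   (xlvii.).

Or, multiplying by $n_{x_p}$ and summing for all arrays,

$$0 = b_0 + b_2 + b_3\sqrt{\beta_1},$$

the sign of $\sqrt{\beta_1}$ being always that of the 3rd moment. Hence, measuring from
the means of the two characters, i.e., $X_p = x_p - \bar x$, $Y_{x_p} = y_{x_p} - \bar y$, we may re-write (xlvii.)

$$Y_{x_p}/\sigma_y = b_1 (X_p/\sigma_x) + b_2\{(X_p/\sigma_x)^2 - 1\} + b_3\{(X_p/\sigma_x)^3 - \sqrt{\beta_1}\}$$   (xlviii.).

Now multiply by $n_{x_p} X_p/\sigma_x$, and sum for all arrays, remembering that

$$N r \sigma_x\sigma_y = S(n_{xy} XY) = S(n_{x_p} X_p Y_{x_p}),$$

we find

$$r = b_1 + b_2\sqrt{\beta_1} + b_3\beta_2.$$

This enables us to get rid of $b_1$ and write (xlviii.)

$$Y_{x_p}/\sigma_y = r X_p/\sigma_x + b_2\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} + b_3\{(X_p/\sigma_x)^3 - \beta_2 (X_p/\sigma_x) - \sqrt{\beta_1}\}$$   (xlix.).

Now multiply by $n_{x_p} X_p^2/\sigma_x^2$ and sum for all arrays. We have

$$\frac{p_{21}}{\sigma_x^2\sigma_y} = r\sqrt{\beta_1} + b_2(\beta_2 - \beta_1 - 1) + b_3(\beta_3/\sqrt{\beta_1} - \beta_2\sqrt{\beta_1} - \sqrt{\beta_1}),$$

or

$$\epsilon = b_2\phi_2 + b_3\phi_3$$   (l.),

where

$$\phi_2 = \beta_2 - \beta_1 - 1, \quad \phi_3 = (\beta_3 - \beta_1\beta_2 - \beta_1)/\sqrt{\beta_1}$$   (li.).

* 'Phil. Trans.,' A, vol. 186, p. 368, and A, vol. 198, p. 278.

Eliminating $b_2$, we can write (xlix.)

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} + b_3\Big[(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x) - \sqrt{\beta_1} - \frac{\phi_3}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\}\Big]$$   (lii.).

Now multiply by $n_{x_p}(X_p/\sigma_x)^3$ and sum for all arrays; we find

$$\zeta = \frac{\epsilon\phi_3}{\phi_2} + b_3\left(\phi_4 - \frac{\phi_3^2}{\phi_2}\right),$$

or

$$(\zeta\phi_2 - \epsilon\phi_3)/(\phi_2\phi_4 - \phi_3^2) = b_3$$   (liii.),

where

$$\phi_4 = \beta_4 - \beta_2^2 - \beta_1$$   (liv.).

It follows from (l.) that

$$b_2 = (\epsilon\phi_4 - \zeta\phi_3)/(\phi_2\phi_4 - \phi_3^2)$$   (lv.).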
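
The two coefficients can be checked numerically. The sketch below assumes that the derivation above amounts to the pair of linear equations $\epsilon = b_2\phi_2 + b_3\phi_3$ and $\zeta = b_2\phi_3 + b_3\phi_4$; the function name `cubic_coefficients` is illustrative, and the trial values are those quoted for the woodruff data in Illustration A.

```python
def cubic_coefficients(eps, zeta, phi2, phi3, phi4):
    """Solve  eps  = b2*phi2 + b3*phi3
              zeta = b2*phi3 + b3*phi4
    for (b2, b3), i.e. the closed forms (lv.) and (liii.)."""
    det = phi2 * phi4 - phi3 ** 2
    b3 = (zeta * phi2 - eps * phi3) / det
    b2 = (eps * phi4 - zeta * phi3) / det
    return b2, b3


# Trial values: epsilon, zeta, phi_2, phi_3, phi_4 as quoted for the woodruff data.
b2, b3 = cubic_coefficients(-0.120164, -0.038241, 0.811740, 0.286465, 0.610879)
print(b2, b3)
```

Substituting the returned pair back into the two equations reproduces $\epsilon$ and $\zeta$ to rounding error, which is a convenient guard against sign slips in hand computation.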

We  can  thus  write  the  cubic  regression  curve  in  either  of  the  forms* 

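* The method is perfectly easy of extension, if we choose to use higher products and moments, to a
regression curve of any order, e.g.,

$$Y_{x_p}/\sigma_y = b_0 + b_1(X_p/\sigma_x) + b_2(X_p/\sigma_x)^2 + \ldots + b_n(X_p/\sigma_x)^n + \ldots$$

For let

$$N\epsilon_{p1} = S(n_{x_p} Y_{x_p} X_p^p)/(\sigma_x^p\sigma_y), \quad \text{and} \quad \gamma_p = \nu_p/\sigma_x^p = S(n_{x_p} X_p^p)/(N\sigma_x^p);$$

we have

$$\begin{aligned}
0 &= b_0 + 0\times b_1 + b_2 + \gamma_3 b_3 + \ldots + \gamma_n b_n + \ldots \\
\epsilon_{11} &= 0\times b_0 + b_1 + \gamma_3 b_2 + \gamma_4 b_3 + \ldots + \gamma_{n+1} b_n + \ldots \\
\epsilon_{21} &= b_0 + \gamma_3 b_1 + \gamma_4 b_2 + \gamma_5 b_3 + \ldots + \gamma_{n+2} b_n + \ldots \\
&\;\;\vdots \\
\epsilon_{p1} &= \gamma_p b_0 + \gamma_{p+1} b_1 + \gamma_{p+2} b_2 + \gamma_{p+3} b_3 + \ldots + \gamma_{n+p} b_n + \ldots
\end{aligned}$$

Hence writing $\epsilon_{01}$ for 0, $\gamma_0 = 1$, $\gamma_1 = 0$, $\gamma_2 = 1$, we have

$$b_n = (\epsilon_{01}\Delta_{0n} + \epsilon_{11}\Delta_{1n} + \epsilon_{21}\Delta_{2n} + \ldots + \epsilon_{p1}\Delta_{pn} + \ldots)/\Delta,$$

where

$$\Delta = \begin{vmatrix}
\gamma_0 & \gamma_1 & \gamma_2 & \gamma_3 & \ldots & \gamma_n & \ldots \\
\gamma_1 & \gamma_2 & \gamma_3 & \gamma_4 & \ldots & \gamma_{n+1} & \ldots \\
\gamma_2 & \gamma_3 & \gamma_4 & \gamma_5 & \ldots & \gamma_{n+2} & \ldots \\
\vdots & & & & & \vdots & \\
\gamma_p & \gamma_{p+1} & \gamma_{p+2} & \gamma_{p+3} & \ldots & \gamma_{p+n} & \ldots
\end{vmatrix}$$

and $\Delta_{pn}$ is the minor of the constituent in the $(p+1)$th row and $(n+1)$th column. As we have already
noted, however, solutions involving anything beyond $\gamma_6$ are hardly likely to be of practical value.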
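The value above for $b_n$ is the type equation given by the method of least squares, when we strike the
best fitting curve to all the entries in the correlation table. I have already pointed out that the method
of moments becomes identical with that of least squares, when we fit parabolae of any order ('Biometrika,'
vol. I., p. 271). The retention of the method of moments, however, enables us, without abrupt change of
method, to introduce the needful $\eta$, and to grasp at once the application of the proper Sheppard's
corrections. The extension of the method of least squares to continua in space has not yet, as far as I am
aware, been fully considered.

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} + \frac{\zeta\phi_2 - \epsilon\phi_3}{\phi_2\phi_4 - \phi_3^2}\Big[(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x) - \sqrt{\beta_1} - \frac{\phi_3}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\}\Big]$$   (lvi.),

or

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon\phi_4 - \zeta\phi_3}{\phi_2\phi_4 - \phi_3^2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} + \frac{\zeta\phi_2 - \epsilon\phi_3}{\phi_2\phi_4 - \phi_3^2}\{(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x) - \sqrt{\beta_1}\}$$   (lvi.) bis.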

The former arrangement of the solution, while it is apparently more cumbersome,
is, perhaps, the better, for it gives us at once the measure of the deviation from
parabolic or 2nd order regression, i.e., the approach of $\zeta\phi_2 - \epsilon\phi_3$ to zero. In the case
of normal correlation both $\epsilon$ and $\zeta$ vanish, and neglecting higher terms the condition
for linear regression is that $\epsilon = 0$, and $\zeta\phi_2 - \epsilon\phi_3 = 0$, or, again, $\epsilon$ and $\zeta = 0$. For
material in which the x-variability is isocurtic, $\beta_1 = \beta_3 = \phi_3 = 0$, and the regression
curve takes the simple form

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon}{\phi_2}\{(X_p/\sigma_x)^2 - 1\} + \frac{\zeta}{\phi_4}\{(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x)\}$$   (lvi.) ter.

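We now turn to express these relations in terms of the correlation ratio $\eta$.
Multiply (lvi.) by $n_{x_p} Y_{x_p}/\sigma_y$, and sum for all arrays, we obtain

$$\eta^2 = r^2 + \frac{\epsilon}{\phi_2}\,\epsilon + \frac{\zeta\phi_2 - \epsilon\phi_3}{\phi_2\phi_4 - \phi_3^2}\left\{\zeta - \frac{\phi_3}{\phi_2}\,\epsilon\right\},$$

whence results

$$\phi_2(\eta^2 - r^2) - \epsilon^2 - \frac{(\zeta\phi_2 - \epsilon\phi_3)^2}{\phi_2\phi_4 - \phi_3^2} = 0$$   (lvii.).

(lvii.) is a necessary condition of cubical regression.

It is of course not a sufficient condition, as we ought to show that $b_4$, $b_5$, &c., all
vanish, and thus any number of conditions may be found. For example, multiply by
$n_{x_p} X_p^4/\sigma_x^4$ and sum for all arrays, then

$$\theta = \frac{\epsilon\phi_4 - \zeta\phi_3}{\phi_2\phi_4 - \phi_3^2}(\beta_4 - \beta_3 - \beta_2) + \frac{\zeta\phi_2 - \epsilon\phi_3}{\phi_2\phi_4 - \phi_3^2}\cdot\frac{\beta_5 - \beta_2\beta_3 - \beta_1\beta_2}{\sqrt{\beta_1}}$$   (lviii.)

is also a necessary condition. Here $\beta_5 = \nu_3\nu_7/\sigma_x^{10}$. But the high as well as complicated
value of the probable errors of such expressions renders it idle to consider them in
practice.
Substituting (lvii.) in (lvi.) we have:

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} \pm \sqrt{\frac{\phi_2(\eta^2 - r^2) - \epsilon^2}{\phi_2\phi_4 - \phi_3^2}}\,\Big[(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x) - \sqrt{\beta_1} - \frac{\phi_3}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\}\Big]$$   (lix.).

Which sign is to be given to the root will often be visible on inspection of the
observations. Otherwise the sign of the root must be the same as that of $\zeta\phi_2 - \epsilon\phi_3$.

(lix.) will save the calculation of $\zeta$ if the root-sign can be found by inspection.
Finally there is a third form into which we may put the cubic. Eliminate $\phi_2\phi_4 - \phi_3^2$
from (lix.) by aid of (lvii.) and it becomes

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon\zeta - \phi_3(\eta^2 - r^2)}{\zeta\phi_2 - \epsilon\phi_3}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\} + \frac{\phi_2(\eta^2 - r^2) - \epsilon^2}{\zeta\phi_2 - \epsilon\phi_3}\{(X_p/\sigma_x)^3 - \beta_2(X_p/\sigma_x) - \sqrt{\beta_1}\}$$   (lx.).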

At first sight this might appear to be the best form of the cubic, because it does
not involve the 6th moment of the variable x. But this is very far from being the
case in actual practice. The reason is simply this, $\epsilon$, $\zeta$ and $\eta^2 - r^2$ are in most cases
very  small — they  vanish  in  normal  correlation — relatively  to      and         Hence  both 
numerators  and  denominators  of  the  coefficients  of  the  square  and  cubic  terms  are 
the  ratio  of  small  quantities,  and  accordingly  subject  to  large  probable  errors.  For 
this reason (lx.) was found in actual practice to be of no service. Of the other two
forms (lvi.) and (lix.), neither of which suffers from this defect, $\phi_2\phi_4 - \phi_3^2$ being always
large relative to the numerators, (lix.) while involving a 6th moment does not
involve a 4th product, $\zeta$, and experience shows that the former is on the whole
easier to determine and more exact than the latter. Hence (lix.) seems the preferable
form, even if it be needful in certain cases to determine $\zeta$, in order to fix the
sign of the radical. The cubic regression curve thus demands a knowledge of the
correlation ratio $\eta$, of the "cubic product" $\epsilon$ and the sign by inspection or calculation
of $\zeta\phi_2 - \epsilon\phi_3$. Besides this, we require the first six moments of the independent
variable  x.    Of  course  if  the  regression  of  x  on  y  be  required,  as  well  as  that  of 
y  on  x,  the  second  correlation  ratio  and  cubic  product  as  well  as  the  first  six  moments 
of  y  must  be  found.    It  is  rare,  however,  that  both  regression  curves  are  needed  for 
a  single  enquiry. 

As to the general form of (lix.), we note that there will always be a real point of
inflexion given by

$$X_p/\sigma_x = \tfrac{1}{3}(b_3\phi_3 - \epsilon)/(b_3\phi_2)$$   (lxi.),

where

$$b_3 = \pm\sqrt{\frac{\phi_2(\eta^2 - r^2) - \epsilon^2}{\phi_2\phi_4 - \phi_3^2}},$$

and further that there may be two points of horizontality given by a certain quadratic.
Thus, in general, the regression line will tend to be part of an S-shaped curve. The
horizontal  points  may  be  imaginary,  or,  if  real,  either  they  or  the  point  of  inflexion 
may  be  far  beyond  the  portion  of  the  curve  which  crosses  the  observed  field  of 
frequency.  If  we  consider,  however,  the  slope  of  the  regression  curve  to  measure 
the  regression  in  the  neighbourhood  of  any  point,  we  note  that  the  regression  is  a 
maximum  at  the  point  given  by  (Ixi.),  and  grows  smaller  and  smaller  towards  the  two 
points  of  horizontality,  i.e.,  points  of  complete  local  independence  of  the  two 
characters.  These  are  not  unfamiliar  features  in  certain  practical  cases  of  skew 
correlation,*  and  accordingly  the  cubic  regression  curve  provides  us  with  a  ready 
means  of  describing  regression  phenomena,  which  cannot  be  dealt  with  by  the  simple 
line  or  the  parabola. 

It may of course be suggested that a quartic or quintic curve would give a
better result than a cubic. The answer to this is: Possibly, but the high moments
and products required render it impossible to deal even superficially with the probable
errors of the constants involved. The calculation of the probable error of $\eta$ is a
sufficiently stiff task in the general case. To test the probable error of a condition
like (lvii.), to say nothing of one like (lviii.), would involve an immense amount of
work, since we should want the correlation of errors in $\eta$, $\epsilon$, and $\theta$. Speaking with
some experience of practical statistical possibilities, I think the tendency to use very
high moments or product-moments must be curtailed to the minimum of actual needs.
We cannot deny the existence of skew variation, nor of the sensible curvature of
regression lines. We must admit their existence as the result of statistical experience.
This existence involves a great widening of the old frequency notions and the need
for a new means of description. But we must remember that statistics are essentially
a practical study, the art of describing by a few numerical constants observational
experience, and we must curtail at every turn the desire to run riot in mathematical
formulae, which cannot be generally applied in actual practice.† Still I propose later
in this paper to deal with the general formulae for quartic regression.

(6.) Parabolic Regression.
For a parabolic system $b_3$ must vanish, or nearly vanish. Hence we have from
(liii.) and (lvii.)

$$\zeta\phi_2 - \epsilon\phi_3 = 0$$   (lxii.),

$$\phi_2(\eta^2 - r^2) - \epsilon^2 = 0$$   (lxiii.).

* Compare for example the regression line of mean age of bridegroom for actual age of bride,
which gives a typical S-shaped curve. See 'Biometrika,' vol. II., p. 20.
† These remarks have special reference to the points dealt with on p. 6.
From these conditions we find

$$b_2 = \epsilon/\phi_2 = \pm\sqrt{(\eta^2 - r^2)/\phi_2}.$$

These give for the form of the parabolic regression curve

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) + \frac{\epsilon}{\phi_2}\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\}$$   (lxiv.),

or

$$Y_{x_p}/\sigma_y = r(X_p/\sigma_x) \pm \sqrt{\frac{\eta^2 - r^2}{\beta_2 - \beta_1 - 1}}\,\{(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1\}$$   (lxv.).

The latter form, besides the correlation coefficient and correlation ratio, requires only
a knowledge of the skew variation constants $\beta_1$ and $\beta_2$, and is therefore very easy to
determine. Except for very nearly linear regression, there can be no doubt as to the
sign of $\sqrt{\eta^2 - r^2}$, as we can tell at once whether the parabola ought to be concave or
convex to the x-axis. In other cases the sign of $\sqrt{\eta^2 - r^2}$ must be taken to coincide
with that of $\epsilon$, which must therefore be found. It will then be as easy to use (lxiv.)
as (lxv.), although probably $\eta$ and $r$ can be found with less error than $\epsilon$.
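
As a minimal numerical sketch of the form (lxv.), the Python below computes the parabola's coefficients from $r$, $\eta$, $\sqrt{\beta_1}$ and $\beta_2$ alone. The function name and the `sign` argument (+1 or -1, the sign chosen for the radical) are illustrative assumptions; the trial constants are those quoted for the woodruff data of Illustration A.

```python
import math


def parabolic_coefficients(r, eta, root_beta1, beta2, sign):
    """Coefficients (c0, c1, c2) of  Y/sigma_y = c0 + c1*t + c2*t**2,
    with t = X/sigma_x, following the form (lxv.).

    sign is +1 or -1: the sign given to sqrt(eta^2 - r^2), i.e. that of epsilon.
    """
    phi2 = beta2 - root_beta1 ** 2 - 1
    b2 = sign * math.sqrt((eta ** 2 - r ** 2) / phi2)
    # Expand r*t + b2*(t^2 - sqrt(beta1)*t - 1) in powers of t.
    return -b2, r - b2 * root_beta1, b2


# Constants quoted for the woodruff data (negative radical, since epsilon < 0).
c0, c1, c2 = parabolic_coefficients(-0.207579, 0.249911, 0.130487, 1.828767, -1)
sigma_x, sigma_y, mean_y = 1.336887, 0.897842, 6.655375

# Convert to actual units: branches against whorl position measured from its mean.
a0 = mean_y + c0 * sigma_y
a1 = c1 * sigma_y / sigma_x
a2 = c2 * sigma_y / sigma_x ** 2
print(a0, a1, a2)
```

With these inputs the three coefficients agree, to about the fifth decimal place, with the constants 6.794,052, -.125,872 and -.077,592 printed for the woodruff parabola.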

It is thus quite easy to allow for such curvature of the regression line as can be
expressed by a parabola of the 2nd order of the type considered.

We  notice  at  once  that  the  regression  curve  does  not  pass  through  the  mean  of  the 
two  characters.  Or,  an  individual  with  the  mean  of  one  character  will  most  probably 
not  have  the  mean  of  a  second  character.  This  is  a  rather  important  result,  which 
follows  at  once  for  nearly  all  types  of  skew  correlation. 

It  will  be  seen,  for  example,  that  Quetelet's  "  mean  man,"  defended  by  Professor 
Edgeworth  as  theoretically  justifiable,  depends  entirely  on  human  characters  giving 
linear regression curves. Such linear curves are certainly given by many pairs of
characters, cranial and body measurements, but there are certainly other
characters for which regression ceases to be sensibly linear, and the conception of the
"  mean  man  "  in  this  case  fails.  For  example,  if  age  be  considered  as  a  character, 
then  the  regression  is  certainly  not  linear,  and  the  individual  of  mean  age  will  not 
necessarily  have  either  the  mean  physical  or  psychical  characters.  This  seems  of 
some  importance  for  the  general  conception  of  "  type,"  if  by  type  we  denote  the  mean, 
for  probably  there  are  other  characters  than  age  for  which  regression  is  skew. 

The regression, i.e., $dY_{x_p}/dX_p$, will be zero for a point $X_p$ for which

$$X_p/\sigma_x = \tfrac{1}{2}\left\{\sqrt{\beta_1} \mp r\sqrt{\frac{\phi_2}{\eta^2 - r^2}}\right\}$$   (lxvi.),

the sign of the root being determined as before. Clearly, therefore, unless $r$ be very
small, or $\eta^2$ diverges very sensibly from $r^2$, this point of zero regression may correspond
to a very large abscissa, and in some cases will lie entirely outside the range of
observable frequency.

The parabola of regression cuts the line of regression, i.e., the line of best fit to
the series of regression points, or to the means of the x-arrays, in two points
determined by the quadratic equation

$$(X_p/\sigma_x)^2 - \sqrt{\beta_1}(X_p/\sigma_x) - 1 = 0,$$

or

$$X_p/\sigma_x = \tfrac{1}{2}\{\sqrt{\beta_1} \pm \sqrt{\beta_1 + 4}\}$$   (lxvii.).

These points are always real, and correspond, if regression be truly parabolic, to
the same values of the x-character, whatever be the y-character of which we are
considering the correlation. In the case of normal variation of the x-character
only, these are the points of inflexion of the x-distribution.
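
The point of zero regression and the two crossings of line and parabola can be illustrated by a short Python sketch. It assumes the parabola in the form $Y/\sigma_y = rt + b_2(t^2 - \sqrt{\beta_1}\,t - 1)$ with $t = X_p/\sigma_x$; the function names and the `sign` argument are hypothetical, and the numbers are those quoted for the woodruff data in Illustration A.

```python
import math


def zero_regression_point(r, eta, root_beta1, beta2, sign):
    """t = X/sigma_x at which dY/dX = 0 for the parabola
    Y/sigma_y = r*t + b2*(t**2 - sqrt(beta1)*t - 1)."""
    phi2 = beta2 - root_beta1 ** 2 - 1
    b2 = sign * math.sqrt((eta ** 2 - r ** 2) / phi2)
    return 0.5 * (root_beta1 - r / b2)


def line_parabola_crossings(root_beta1):
    """Roots of t**2 - sqrt(beta1)*t - 1 = 0, where parabola and line meet."""
    d = math.sqrt(root_beta1 ** 2 + 4)
    return 0.5 * (root_beta1 - d), 0.5 * (root_beta1 + d)


# Woodruff constants; the radical is taken negative, as epsilon is negative.
t0 = zero_regression_point(-0.207579, 0.249911, 0.130487, 1.828767, -1)
x0 = 2.802651 + 1.336887 * t0   # back to actual whorl position
lo, hi = line_parabola_crossings(0.130487)
print(x0, lo, hi)
```

Since the product of the two roots is always -1, the crossings are necessarily real and lie on opposite sides of the mean, as stated above.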

(7.)  Linear  Regression. 

In this case it is necessary that both $b_2$ and $b_3$ vanish within the limits of random
sampling, and, although these are not theoretically sufficient (for a whole series of
relations between the higher product-moments could be written down*), they are for
practical purposes sufficient.

Hence we have the following conditions for linear regression:

$$\eta^2 = r^2$$   (lxviii.),

or, the coefficient of correlation, without regard to sign, should be equal to the
correlation ratio. Further $\epsilon$ should be zero, or

$$p_{21}p_{20} - p_{11}p_{30} = 0$$   (lxix.).

The  theory  of  linear  regression  is  so  familiar  that  it  need  not  be  further  discussed 
here.  In  the  actual  practice  of  statistics,  the  determination  of  the  means  of  the 
x-arrays and the drawing of the regression line will often suffice to show the fairly
trained eye whether the deviations from it are random or not. If they are not
random, then we must proceed to the determination of $\eta$ and of the higher
product-moments.

The  following  are  numerical  examples  of  skew  correlation,  selected  to  illustrate  the 
theory  developed  above. 

* For example, it is necessary in most cases that $\zeta$ should vanish. In the instance of that very special
case of linear regression, the Gauss-Laplacian normal frequency, it is easy to show that the constants $\epsilon$, $\zeta$
both vanish as well as $\eta^2 - r^2$.
Statistical Illustrations.

(8.) Illustration A. On the Skew Correlation between Number of Branches to the
Whorl and Position of the Whorl on the Spray in the case of Asperula odorata.

In this case the material was collected in a lane near Horsham, Sussex, at
Whitsuntide, 1903, by Miss M. Radford. There were 150 independent sprays, the
woodruff  had  just  flowered,  and  the  whorls  were  counted  from  the  flower  downwards. 
Being  early  in  the  season,  the  maximum  number  of  whorls  was  five,  and,  in  some 
cases,  not  even  as  many  were  available.  The  material  was  counted  and  tabled  by 
the  author,  and  the  results  are  exhibited  in  the  table  below : — 

Table I. Correlation of Whorl-Branches and Position of Whorl.

              Number of branches in whorl.
Whorl.      4.    5.    6.    7.    8.   Totals.   Mean.    S.D.    m_2.    m_3.

First .     -     3    66    42    39     150     6.7800   .8553   .7316   .1535
Second      -     3    61    47    39     150     6.8133   .8437   .7117   .0985
Third .     -     6    60    40    44     150     6.8133   .9047   .8185   .0383
Fourth      1    12    68    39    22     142     6.4859   .8780   .7709   .1347
Fifth .     1    13    53    10    10      87     6.1724   .8605   .7404   .4049

Totals.     2    37   308   178   154     679     6.6554     -       -       -


We  require  the  regression  curve  giving  the  probable  number  of  branches  for  a 
given  whorl. 

Dealing  first  with  the  skew  variation  in  position,  a  purely  arbitrary  system 
depending  solely  on  the  number  of  whorls  dealt  with  in  each  position,  we  find,  not 
using  Sheppard's  correction,* 


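Mean = 2.802,651,        $\nu_2$ = 1.787,268,      $\nu_5$ = 2.799,638,
$\sigma_x$ = 1.336,887,         $\nu_3$ = .311,783,       $\nu_6$ = 22.678,308.
                         $\nu_4$ = 5.841,682,

Hence we determine

$\beta_1$ = .017,027,           $\phi_2$ = .811,740,
$\beta_2$ = 1.828,767,          $\phi_3$ = .286,465,
$\beta_3$ = .085,545,           $\phi_4$ = .610,879,
$\beta_4$ = 3.972,295,   and $\sqrt{\beta_1}$ = +.130,487.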
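
The step from the moments to the constants $\beta$ and $\phi$ can be reproduced with a few lines of Python. This is a sketch only: the function name is hypothetical, and $\phi_3$ is written in the algebraically equivalent form $\nu_5/\sigma_x^5 - \sqrt{\beta_1}(\beta_2 + 1)$ so that it remains defined when $\nu_3 = 0$.

```python
import math


def skew_constants(nu2, nu3, nu4, nu5, nu6):
    """beta_1..beta_4, sqrt(beta_1) and phi_2..phi_4 from the moments of x."""
    beta1 = nu3 ** 2 / nu2 ** 3
    beta2 = nu4 / nu2 ** 2
    beta3 = nu3 * nu5 / nu2 ** 4
    beta4 = nu6 / nu2 ** 3
    # sqrt(beta_1) takes the sign of the third moment.
    root_beta1 = math.copysign(math.sqrt(beta1), nu3)
    phi2 = beta2 - beta1 - 1
    # Same as (beta3 - beta1*beta2 - beta1)/sqrt(beta1), without the division.
    phi3 = nu5 / nu2 ** 2.5 - root_beta1 * (beta2 + 1)
    phi4 = beta4 - beta2 ** 2 - beta1
    return {"beta1": beta1, "beta2": beta2, "beta3": beta3, "beta4": beta4,
            "root_beta1": root_beta1, "phi2": phi2, "phi3": phi3, "phi4": phi4}


# Moments of whorl position as tabled above.
woodruff = skew_constants(1.787268, 0.311783, 5.841682, 2.799638, 22.678308)
for name, value in woodruff.items():
    print(name, round(value, 6))
```

Run on the tabled moments, the sketch returns the printed constants to within a unit or two in the sixth decimal place.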


*  The  numbers  are  tabulated  to  six  places,  because  we  cannot  be  sure  that  the  final  calculations  are  for 
the  data  true  to  two  places,  which  is  all  we  finally  retain  unless  this  is  done.  Any  number  of  figures  can 
really be retained with perfect ease when the work is done on a calculator.
We now turn to the skew variation in the number of branches to the whorl, and
get the following constants:

Mean = 6.655,375,        $\mu_2$ = .806,124,
$\sigma_y$ = .897,842,          $\mu_3$ = .132,090,
                         $\mu_4$ = 1.138,410.

The values of $y_{x_p}$, $m_2$, and $m_3$ are given in table above. Using them we find

$\sigma_M$ = .224,377,    $\eta$ = .249,911,    $\sigma_{n_x} = \sigma_y\sqrt{1-\eta^2}$ = .869,355,
$\lambda_2 = \sigma_M^2$ = .050,345,    $\lambda_4$ = .007,474,    $\chi_1$ = .990,862,    $\chi_2$ = -.059,851.

These give by (xxxiii.), showing the numerical contribution of each term,

$$\Sigma_\eta^2 = \frac{1}{N}\{.878{,}991 - .010{,}323 - .000{,}888 - .007{,}231 + .013{,}578\},$$

or the probable error of $\eta$ = .0242.

Had we calculated the probable error of $\eta$ from (xxxiv.), we should have found for
its value .0243. It is clear that for this special case the simple formula (xxxiv.) is
amply sufficient, the small terms almost cancelling.
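
The correlation ratio itself is easily recomputed from the array totals and array means of Table I. In the sketch below the simple probable error is taken, as an assumption, to be $0.67449\,(1-\eta^2)/\sqrt{N}$; formula (xxxiv.) is not reproduced in this section, and this form is adopted only because it returns the value .0243 quoted above.

```python
import math

# Array sizes and array means for the five whorl positions (Table I.).
n = [150, 150, 150, 142, 87]
means = [6.7800, 6.8133, 6.8133, 6.4859, 6.1724]
sigma_y = 0.897842  # standard deviation of branches over all whorls, from the text

N = sum(n)
grand_mean = sum(k * m for k, m in zip(n, means)) / N

# sigma_M: standard deviation of the array means, each weighted by its array size.
sigma_M = math.sqrt(sum(k * (m - grand_mean) ** 2 for k, m in zip(n, means)) / N)

eta = sigma_M / sigma_y
# Assumed simple form of the probable error of eta (see lead-in).
pe_simple = 0.67449 * (1 - eta ** 2) / math.sqrt(N)
print(round(sigma_M, 4), round(eta, 4), round(pe_simple, 4))
```

Because the array means are entered to four decimals only, the result agrees with the printed $\eta$ to about four places rather than six.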

We see that $\chi_1$ is almost unity, and the graph of $\sigma_{n_x}/\sigma_y$ shows indeed that the system
is sensibly homoscedastic. $\chi_2$ is small, but a glance at the graph of the clitic curve
on Diagram I. shows that we can hardly treat the system as homoclitic, the changes
in the skewness forming a fairly uniform curve.*
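
The ordinates of such a clitic curve follow directly from the tabled array moments. The sketch below takes the skewness of an array as $\tfrac{1}{2}m_3/m_2^{3/2}$ and uses the values of $m_2$ and $m_3$ exactly as printed in Table I. (signs as printed); the variable names are illustrative.

```python
# Second and third moments of the five arrays, as printed in Table I.
m2 = [0.7316, 0.7117, 0.8185, 0.7709, 0.7404]
m3 = [0.1535, 0.0985, 0.0383, 0.1347, 0.4049]

# Skewness of each array: half the third moment over the cube of the array S.D.
skewness = [0.5 * third / second ** 1.5 for second, third in zip(m2, m3)]

for position, sk in enumerate(skewness, start=1):
    print(position, round(sk, 4))
```

The values fall from the first whorl to the third and rise again to the fifth, so the arrays are plainly not equally skew.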

For practical purposes, we may treat the variability of the number of branches in
any array as sufficiently closely given by $\sigma_y\sqrt{1-\eta^2}$.

We now turn to the product-moments† and find

$p_{11}$ = -.249,160,       $p_{31}$ = -.896,415,
$p_{21}$ = -.236,289,       $p_{41}$ = -1.210,225.
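
The passage from products about arbitrary working axes to products about the means can be sketched in Python as follows. The function names are hypothetical. The frequencies are those of Table I., with one assumption stated plainly: the column for 4 branches (one whorl in each of the fourth and fifth positions) is inferred from the printed row totals and array means. The working axes are taken through the third whorl and through six branches.

```python
def raw_product(table, xs, ys, q, s):
    """Mean of x'^q * y'^s about the working axes."""
    total = sum(map(sum, table))
    return sum(f * x ** q * y ** s
               for row, x in zip(table, xs)
               for f, y in zip(row, ys)) / total


def central_products(table, xs, ys):
    """p11, p21, p31, p41 obtained by reduction from the raw products."""
    P = lambda q, s: raw_product(table, xs, ys, q, s)
    total = sum(map(sum, table))
    xb, yb = P(1, 0), P(0, 1)
    # Moments of the x-character about its mean.
    nu = {q: sum(sum(row) * (x - xb) ** q for row, x in zip(table, xs)) / total
          for q in (2, 3, 4)}
    p11 = P(1, 1) - xb * P(0, 1)
    p21 = P(2, 1) - 2 * xb * P(1, 1) + xb ** 2 * P(0, 1) - yb * nu[2]
    p31 = (P(3, 1) - 3 * xb * P(2, 1) + 3 * xb ** 2 * P(1, 1)
           - xb ** 3 * P(0, 1) - yb * nu[3])
    p41 = (P(4, 1) - 4 * xb * P(3, 1) + 6 * xb ** 2 * P(2, 1)
           - 4 * xb ** 3 * P(1, 1) + xb ** 4 * P(0, 1) - yb * nu[4])
    return p11, p21, p31, p41


def central_direct(table, xs, ys, q):
    """p_{q1} computed directly about the means, as a cross-check."""
    total = sum(map(sum, table))
    xb = raw_product(table, xs, ys, 1, 0)
    yb = raw_product(table, xs, ys, 0, 1)
    return sum(f * (x - xb) ** q * (y - yb)
               for row, x in zip(table, xs)
               for f, y in zip(row, ys)) / total


# Rows: whorl positions 1..5. Columns: 4, 5, 6, 7, 8 branches
# (the 4-branch column is inferred, see the note above).
table = [[0, 3, 66, 42, 39],
         [0, 3, 61, 47, 39],
         [0, 6, 60, 40, 44],
         [1, 12, 68, 39, 22],
         [1, 13, 53, 10, 10]]
xs = [-2, -1, 0, 1, 2]   # position measured from the third whorl
ys = [-2, -1, 0, 1, 2]   # branches measured from six

print(central_products(table, xs, ys))
```

The reduced values agree with the direct computation about the means, and the first two come out close to the printed $p_{11}$ and $p_{21}$.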

* Throughout these illustrations the clitic curve is plotted by calculating the skewness of the arrays
from $\tfrac{1}{2} m_3/(m_2)^{3/2}$. See p. 23.

† In calculating these products referred to the centroid from those referred to any axes, generally
corresponding to whole numbers in the table, the following reduction formulae will be found useful.
We take $N\Pi_{qq'} = S\{n_{xy}\, x'^q y'^{q'}\}$, $x'$ and $y'$ being measured from any axes, further, $\bar x'$, $\bar y'$ are the distances of the
means from these axes, and $\nu_2$, $\nu_3$, $\nu_4$ the moments of the x-character about its mean as tabled above.

$$p_{11} = \Pi_{11} - \bar x'\Pi_{01}, \qquad p_{21} = \Pi_{21} - 2\bar x'\Pi_{11} + \bar x'^2\Pi_{01} - \bar y'\nu_2,$$

$$p_{31} = \Pi_{31} - 3\bar x'\Pi_{21} + 3\bar x'^2\Pi_{11} - \bar x'^3\Pi_{01} - \bar y'\nu_3,$$

$$p_{41} = \Pi_{41} - 4\bar x'\Pi_{31} + 6\bar x'^2\Pi_{21} - 4\bar x'^3\Pi_{11} + \bar x'^4\Pi_{01} - \bar y'\nu_4.$$

The p's should be further corrected for grouping by Sheppard's corrections (given on my p. 36), provided
there be high contact at the contour of the surface of frequency. Sheppard's corrections have not in this
These lead to

$r$ = -.207,579,    $\epsilon$ = -.120,164,    $\zeta$ = -.038,241,    $\theta$ = -.285,890.

Thus all the constants are determined.
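
These four constants can be recomputed from the product-moments and the moments of the x-character. The sketch below follows the p-notation of (xlvi.), with $p_{20}, p_{30}, p_{40}, p_{50}$ written as $\nu_2 \ldots \nu_5$; the function name is illustrative.

```python
def standardised_products(p11, p21, p31, p41, nu2, nu3, nu4, nu5, sigma_y):
    """Return (r, epsilon, zeta, theta) in the p-notation of (xlvi.)."""
    sigma_x = nu2 ** 0.5
    r = p11 / (sigma_x * sigma_y)
    eps = (p21 * nu2 - p11 * nu3) / (sigma_x ** 4 * sigma_y)
    zeta = (p31 * nu2 - p11 * nu4) / (sigma_x ** 5 * sigma_y)
    theta = (p41 * nu2 - p11 * nu5) / (sigma_x ** 6 * sigma_y)
    return r, eps, zeta, theta


# Product-moments and moments quoted above for the woodruff data.
r, eps, zeta, theta = standardised_products(
    -0.249160, -0.236289, -0.896415, -1.210225,
    1.787268, 0.311783, 5.841682, 2.799638, 0.897842)
print(round(r, 6), round(eps, 6), round(zeta, 6), round(theta, 6))
```

The output agrees with the four printed constants to within about one unit in the sixth decimal place.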
We find

$\eta^2 - r^2$ = .019,367,
$\phi_2(\eta^2 - r^2) - \epsilon^2$ = .001,281,
$\phi_2(\eta^2 - r^2) - \epsilon^2 - (\zeta\phi_2 - \epsilon\phi_3)^2/(\phi_2\phi_4 - \phi_3^2)$ = .000,276.

These should be respectively zero for linear, parabolic, and cubical regressions. It
will be seen that they are satisfied with increasing closeness; we might well be
satisfied even with the parabolic regression curve. The following are the regression
curves determined, $y_{x_p}$ being the actual number of branches in the whorl
(= 6.655,375 + $Y_{x_p}$), and $x_p$ the actual position of the whorl:

(a.) Straight line:

$y_{x_p}$ = 7.046,087 - .139,408 $x_p$.

(b.) Parabola from (lxv.):

$y_{x_p}$ = 6.794,052 - .125,872 $X_p$ - .077,592 $X_p^2$;

or,

$y_{x_p}$ = 6.853,561 - .077,592 ($x_p$ - 1.991,535)$^2$.

This clearly gives a maximum number of branches, 6.8536, corresponding to
$x_p$ = 1.9915, a value within the limits of observation.
(c.) Cubic from (lix.):

$y_{x_p}$ = 6.799,399 - .192,439 $X_p$ - .084,230 $X_p^2$ + .020,915 $X_p^3$.

Here $X_p$ is measured from the mean position, = $x_p$ - 2.802,651, and $y_{x_p}$ is, as before,
the total number of branches for the given position.

Condition  (Ivii.)  is  so  closely  satisfied  that  we  shall  here  get  sensibly  as  good 
results  from  (lix.)  as  from  (Ivi.), 

In  the  table  below  and  in  the  curves  of  Diagram  I.  the  values  of  the  mean  of 
the  arrays,  as  found  from  line,  parabola,  and  cubic,  are  given  and  compared  with 
observation. 
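
The comparison can be reproduced from the fitted constants. The Python sketch below evaluates the three curves as printed above (the parabola in its second, vertex form) at the five observed whorl positions and forms a root-mean-square residual weighted by the array totals; the weighting and the function names are illustrative choices rather than anything prescribed in the text.

```python
import math

MEAN_POSITION = 2.802651


def line(x):
    return 7.046087 - 0.139408 * x


def parabola(x):
    return 6.853561 - 0.077592 * (x - 1.991535) ** 2


def cubic(x):
    X = x - MEAN_POSITION  # position measured from the mean
    return 6.799399 - 0.192439 * X - 0.084230 * X ** 2 + 0.020915 * X ** 3


# Observed array means and array totals for whorl positions 1..5 (Table I.).
observed = {1: 6.780, 2: 6.813, 3: 6.813, 4: 6.486, 5: 6.172}
weights = {1: 150, 2: 150, 3: 150, 4: 142, 5: 87}


def weighted_rms(curve):
    total = sum(weights.values())
    return math.sqrt(sum(weights[x] * (curve(x) - observed[x]) ** 2
                         for x in observed) / total)


for name, curve in (("line", line), ("parabola", parabola), ("cubic", cubic)):
    print(name, [round(curve(x), 3) for x in observed], round(weighted_rms(curve), 4))
```

On this rough measure the parabola comes much closer to the observed array means than the straight line does.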

case been used, as this condition is not fulfilled. The axes $x'$, $y'$ actually taken for woodruff were those
through the third whorl and through six branches.

An obvious warning about the signs of the sums of the products may be given which may save
computators some trouble. The axes being taken positive, as in the accompanying
figure, then the sums of the products for $\Pi_{11}$ and $\Pi_{31}$ are positive in the 1st and
3rd, negative in the 2nd and 4th quadrants. For $\Pi_{21}$ and $\Pi_{41}$ they are positive
in the 1st and 4th quadrants and negative in the 2nd and 3rd quadrants. In
the figure the axes are taken so as to suit the x and y-directions of the table on
p. 31. Care must, of course, be paid to this point. The products may also
be found from the $y_{x_p}$'s in the manner indicated on p. 35, footnote. They were thus verified in this case.

[Figure: the x and y axes, with the four quadrants numbered 1st to 4th.]

Table II. Mean Branches to each Whorl.

$x_p$ =                       0.       1.      2.      3.      4.      5.      6.

$y_{x_p}$ from line           [7.046]   6.907   6.767   6.628   6.488   6.349   6.210
$y_{x_p}$  "   parabola       [6.546]   6.777   6.854   6.775   6.541   6.151   5.607
$y_{x_p}$  "   cubic          [6.117]   6.750   6.889   6.758   6.443   6.192   6.007
Observed                        -      6.780   6.813   6.813   6.486   6.172     -

I  think  we  may  safely  say  that  in  the  relationship  of  branches  to  position  of  the 
whorl  in  woodruff  we  have  a  case  of  homoscedastic  correlation,  which  is  effectively 
described  by  a  parabolic  regression  curve.  Thus,  in  a  case  of  this  kind,  it  is  only 
needful,  besides  the  moments  up  to  the  fourth  of  the  x- character,  to  find  the 
correlation  coefficient  r  and  the  correlation  ratio  rj. 


(9.)  Illustration  B. — On  the  Correlation  between  Age  and  Head  Height  in  Girls. 

The data for this are taken from my School Measurement series, and involve the
auricular  heights  of  2272  girls  between  the  ages  of  3  and  22.  There  was  considerable 
paucity  of  material  at  the  extreme  ends  of  the  range,  and  accordingly  as  our  correlation 
curves  are  all  obtained  by  weighting  the  observations,  we  can  hardly  expect  good  fits 
near  3  or  22  years  of  age.  The  actual  correlation  table  is  given  as  Table  III. 
Sheppard's  corrections  were  applied  throughout,  and  the  unit  of  height  is  2  millims. 

In the first place the means, standard deviations, and 3rd moments of all the arrays
of heights for different years of age were determined. These are given at the foot of
Table III., but in actually calculating the constants more places of decimals were
used. Then the first six moments of the frequency of the ages were found and the
first four moments of the height frequencies. These are the x and y-frequencies.
They  give  us  : — 


Table  III. — Correlation  between  Age  and  Auricular  Height  of  Head  in  Girls. 


Age.  i 

10  joM  page  6 

Totals. 

3-4. 

4-5. 

5-6. 

6-7. 

7-8. 

8-9. 

9-10. 

10-11. 

n-12. 

12-13. 

13-14. 

14-10. 

15-16. 

16-17. 

17-18. 

18-19. 

19- 

.iO. 

20-21. 

21-22. 

22-23. 

niilliiiis. 
102  ^o-lOi  -25 

— 

1 

1 

— 

— 

— 

— 

— 

— 

— 

— 

— 

— 

— 

- 

- 

- 

- 

- 

2 

millims. 
102  -26-104  -26 

J      104  -25-106  -25 

— 

— 

— 

2 

— 

1 

1 

1 

1 

2 

— 

— 

— 

- 

- 

- 

- 

- 

10 

104  "23-106 "25 

— 

— 

1 

— 

1 

— 

1 

— 

4 

— 

2 

— 

1 

— 

- 

- 

- 

- 

- 

10 

106  "26-108  -26 

108  "25-110  "25 

— 

— 

— 

1 

5 

2 

1 

4 

2 

4 

1 

3 

1 

- 

- 

- 

- 

27 

108  -26-110  -23 

110  "25-112  "25 

— 

— 

1 

3 

1 

5 

12 

3 

0 

6 

3 

9 

2 

4 

1 

- 

- 

- 

56 

110  -23-112  -26 

112  "25-114  "25 

- 

- 

1 

— 

4 

3 

10 

8 

G 

9 

4 

3 

5 

5 

1 

- 

- 

- 

- 

59 

112  -26-114  -25 

114  "25-116  "25 

1 

- 

3 

4 

7 

8 

IS 

14 

11 

10 

10 

7 

6 

8 

2 

2 

1 

- 

- 

115 

114  "26-116  "25 

116  "25-llS  "25 

- 

2 

2 

9 

9 

7 

10 

23 

15 

18 

13 

9 

11 

6 

4 

3 

- 

1 

- 

142 

116  -25-118 "25 

118  "25-120  "25 

2 

2 

4 

13 

22 

24 

23 

37 

44 

23 

11 

19 

6 

6 

3 

! 

1 

- 

- 

244 

118  "25-120 -25 

120  "23-122  -25 

- 

2 

3 

6 

9 

19 

25 

29 

34 

41 

32 

21 

15 

13 

9 

4 

— 

3 

- 

265 

120  -25-122  "25 

td 

122  "25-124  "25 

- 

- 

3 

3 

7 

17 

23 

34 

38 

33 

21 

22 

18 

25 

9 

4 

I 

2 

- 

1 

261 

122  "25-124  "25 

124  "25-126  "25 

- 

- 

— 

1 

6 

19 

18 

33 

29 

40 

32 

23 

2U 

14 

13 

10 

1 

1 

- 

263 

124  "25-126 -26 

t  of  ] 

m  "25-128  "25 

- 

— 

1 

0 

9 

10 

8 

21 

27 

27 

32 

20 

18 

16 

13 

9 

- 

— 

1 

219 

126  -25-128  -26 

p 

128  "25-130  "25 

- 

"- 

- 

- 

- 

6 

9 

17 

10 

20 

39 

25 

29 

16 

11 

7 

1 

1 

- 

197 

128  -25-130  -25 

130  "25-132  "25 

- 

- 

- 

1 

3 

5 

7 

13 

17 

17 

15 

18 

13 

6 

6 

1 

1 

1 

- 

131 

130  -26-132  -23 

132  "25-134  "25 

— 

— 

— 

— 

1 

- 

7 

8 

10 

13 

8 

5 

10 

7 

7 

6 

— 

— 

— 

88 

132  -26-134  -26 

134  "25-136  "25 

1 

1 

3 

4 

4 

9 

11 

13 

9 

11 

2 

77 

134  -26-130  -25 

136  "25-138  "25 





_ 

— 

3 

2 

2 

10 

4 

14 

6 

3 

2 

I 

- 

— 

— 

52 

130  "26-138  "25 

138  "25-140  "25 

2 

3 

3 

2 

2 

2 

4 

2 

20 

138  "23-140  -25 

140  "25-142  "25 

- 

- 

- 

- 

- 

- 

1 

2 

1 

2 

4 

2 

2 

1 

I 

- 

- 

- 

16 

140  -25-142  -25 

142  "25-144  "25 

1 

2 

3 

4 

1 

11 

142  -25-144  -23 

144  "25-146  "25 

- 

- 

- 

- 

- 

- 

- 

- 

- 

1 

1 

- 

4 

144  -25-146  -23 

146  "23-148  "25 



] 

I 

- 

1 

146  -23-148  -25 

1 

7 

18 

40 

76 

125 

177 

235 

261 

309 

263 

198 

214 

163 

95 

61 

1. 

i 

7 

8 

2 

2272 

Totals. 

Means  1 
l-milliin,  units  J 

115  "2500 

110 "9043 

117  "4722 

119  "1000 

120  "3020 

121  "0310 

121  "7246 

122  "8100 

123  11.27 

123  "8938 

12 1  "8032 

125  ■71-10 

120-1505 

120  -53.« 

120  -9133 

127  "0305 

129  "t 

577 

123  -8214 

120  -5000 

125  "2500 

124  '0467 

r  Meaus 

1  . 

L         l-uiillim.  units. 

Standard  deriation  1 
2-millim.  units  J 

0 

2  "8853 

2  "9270 

2  "9641 

2  "9882 

2 "63G0 

3  "3877 

2  "9653 

3  -2089 

3  "2061 

3  "3.589 

3  "5805 

3  "4<303 

3  -8696 

3  -1679 

3  "1335 

4  "8400 

2  -3311 

4  "1414 

0  -9374 

3  "4541 

r       Standard  dcTiation 
(_        2-millim.  units. 

Third  momenta  1 

i°  1 
2.imlliia.  unite  J 

0 

-  42  "822 

-  18  "108 

-    7  079 

f     1 "782 

-    0  "171 

+  15-893 

+     2  "330 

+     0  -238 

+     8  ^lO 

-    7  "280 

+     3  "015 

-    0  016 

+     9  "379 

+     3  -991 

+    0  "070 

-  29-164 

_    2  -739 

+  85  -816 

0 

+     5  -200 

Third moments, 2-millim. units.

Height Constants.                               Age Constants.

Mean height = 124.0467 millims.                 Mean age = 12.7007
$\sigma_y$ = 3.454,125  (2-millim. units)       $\sigma_x$ = 3.064,819
$\mu_2$ = 11.930,977    (2-millim. units)       $\nu_2$ = 9.393,110
$\mu_3$ = 5.206,247     (2-millim. units)       $\nu_3$ = 1.051,882
$\mu_4$ = 438.639,633   (2-millim. units)       $\nu_4$ = 239.157,055
                                                $\nu_5$ = 104.298,702
                                                $\nu_6$ = 9536.265,059
$\beta_1$ = .015,960                            $\beta_1$ = .001,335
$\beta_2$ = 3.081,454                           $\beta_2$ = 2.710,593

Further:

$\sigma_M$ = 2.093,366 millims.                 $\beta_3$ (value not legible)
$\lambda_2$ = 4.382,181 (1-millim. units)       $\beta_4$ = 11.506,681
$\lambda_4$ = 62.399,135 (1-millim. units)      $\sqrt{\beta_1}$ = +.036,538
                                                $\phi_2$ = 1.709,258
Hence                                           $\phi_3$ = -.250,123
$(\lambda_4 - 3\lambda_2^2)/(4\lambda_2^2)$ = .062,340          $\phi_4$ = 4.158,032

In the next place the products were worked out and referred to the means with the following results:*

$p_{11}$ = 3.113,712, whence $r$ = .294,128,
$p_{21}$ = -1.957,022, $\epsilon$ = -.071,065,
$p_{31}$ = 74.447,616, $\zeta$ = -.048,576,
$p_{41}$ = -108.701,559, $\theta$ = -.470,126.

Further, from $\sigma_M$, $\eta$ = .303,024.
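The two coefficients just quoted can be checked from the printed constants. The sketch below is only an arithmetic check, assuming (as the constants suggest) that $\sigma_M$ is the standard deviation of the means of the arrays and that the correlation ratio is $\eta = \sigma_M/\sigma_y$; the variable names are illustrative and not from the memoir.

```python
# Constants as printed in the text for the head-height illustration.
sigma_x = 3.064819       # s.d. of age, in years
sigma_y = 3.454125       # s.d. of height, in 2-millim. units
p11 = 3.113712           # first product moment about the means
sigma_M_mm = 2.093366    # s.d. of the array means, in millims. (assumed meaning)

# Correlation coefficient: product moment divided by both standard deviations.
r = p11 / (sigma_x * sigma_y)

# Correlation ratio: sigma_M / sigma_y, with sigma_y converted to millims.
eta = sigma_M_mm / (2.0 * sigma_y)

print(round(r, 6), round(eta, 6))
```

Both values agree with the printed $r$ and $\eta$ to within a unit or so in the sixth decimal place, which is about what the rounding of the printed constants allows.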

In deducing the product-moments after they had been referred to the means, the proper Sheppard's corrections were introduced. These are, if $\{p_{11}\}$, $\{p_{21}\}$, $\{p_{31}\}$, $\{p_{41}\}$ represent the uncorrected moments:

$p_{31} = \{p_{31}\} - \tfrac{1}{4}\{p_{11}\}, \qquad p_{41} = \{p_{41}\} - \tfrac{1}{2}\{p_{21}\},$

the units of grouping being the units throughout.

* These products were in this case (as in all other cases) verified by calculating them from the means of the arrays $\bar{y}_x$. Of course it is easiest to calculate these products about some arbitrary origin coinciding with the abscissa of one array. If these products be then $p'_{11}$, $p'_{21}$, $p'_{31}$, $p'_{41}$, and $\bar{x}$ be the mean, we have

$p_{11} = p'_{11},$
$p_{21} = p'_{21} - 2\bar{x}\,p'_{11},$
$p_{31} = p'_{31} - 3\bar{x}\,p'_{21} + 3\bar{x}^2 p'_{11},$
$p_{41} = p'_{41} - 4\bar{x}\,p'_{31} + 6\bar{x}^2 p'_{21} - \ldots$
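The transfer of product moments from an arbitrary origin to the mean, and the Sheppard corrections that follow it, are both mechanical and can be sketched in a few lines. This is a minimal illustration, assuming $y$ is already measured from its mean (so that the zeroth product moment vanishes); the function names and the small data set are hypothetical.

```python
from math import comb

def shift_product_moments(p_prime, x_bar):
    """Transfer product moments to the mean of x.

    p_prime[k] is the mean of (x - a)^k * (y - mean_y) about an arbitrary
    origin a; x_bar is the mean of x measured from a.  The binomial
    expansion of ((x - a) - x_bar)^k gives the moments about the mean.
    """
    return [
        sum(comb(k, j) * (-x_bar) ** j * p_prime[k - j] for j in range(k + 1))
        for k in range(len(p_prime))
    ]

def sheppard_correct(p, h=1.0):
    """Apply the corrections quoted in the text to [p01, p11, p21, p31, p41].

    h is the width of the grouping unit for x (the text takes h = 1).
    """
    q = list(p)
    q[3] = p[3] - (h * h / 4.0) * p[1]
    q[4] = p[4] - (h * h / 2.0) * p[2]
    return q

# Hypothetical data, used only to check the origin shift against a direct sum.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = 3  # arbitrary working origin

p_prime = [sum((x - a) ** k * (y - mean_y) for x, y in zip(xs, ys)) / n for k in range(5)]
direct = [sum((x - mean_x) ** k * (y - mean_y) for x, y in zip(xs, ys)) / n for k in range(5)]
shifted = shift_product_moments(p_prime, mean_x - a)
```

The general binomial form is used here so that no term has to be written out by hand; for $k \le 3$ it reduces to the expressions given in the footnote.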
From the constants for the arrays, I found

$\chi_1 - 1$ = -.000,675, $\chi_2$ = -.007,198.

Whence the probable error of $\eta$ was determined by (xxxiii.). Its value was*

Probable error of $\eta$ = .012,913.

If found from the simple formula .67449 $(1 - \eta^2)/\sqrt{N}$, the value is .012,851. We accordingly are again forced to the conclusion that the probable error of $\eta$ may for practical purposes be found from this simple formula, instead of the complicated result (xxxiii.). Although both $\chi_1 - 1$ and $\chi_2$ are small, it is very doubtful whether we can legitimately consider the system as homoscedastic. The dotted line ab of Diagram II. would fairly well represent increasing variability with age. The skewness of the arrays is relatively small and changes sign so frequently, that we can certainly not attribute any law to such heteroclitic tendencies as there are. They are probably due to errors of random sampling from truly isocurtic material.
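The comparison of the two probable errors is easy to reproduce. The sketch below assumes $N$ = 2272, the grand total of Table III., and takes the five contributions to (xxxiii.) from the footnote to this passage; the function names are illustrative only.

```python
import math

def probable_error_simple(eta, n):
    """Simple formula: 0.67449 * (1 - eta^2) / sqrt(N)."""
    return 0.67449 * (1.0 - eta ** 2) / math.sqrt(n)

def probable_error_from_terms(terms, n):
    """Fuller formula: 0.67449 * sqrt(sum of the printed contributions / N)."""
    return 0.67449 * math.sqrt(sum(terms) / n)

N = 2272                 # total frequency of Table III. (assumed to be the N used)
eta = 0.303024
terms = [0.824785, 0.001870, 0.004673, -0.000472, 0.001888]

simple = probable_error_simple(eta, N)
full = probable_error_from_terms(terms, N)
print(round(simple, 6), round(full, 6))
```

On these inputs the two results differ only in the fifth decimal place, which is the point made in the text.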

It will be seen that the height frequencies with $\beta_1$ = .0160 and $\beta_2$ = 3.0815 do not differ very much from a normal distribution; in fact, we can lay no stress on the heteroclisy of the system at all. But the values of the standard deviations of the arrays, or the graph of $\sigma_{n_x}/\sigma_y$, certainly show increasing variation with increasing age, a phenomenon with which one is familiar in a variety of other human characters.†

This heteroscedasticity, due to increasing variation with growth, would hardly have been anticipated from a mere inspection of the smallness of $\chi_1 - 1$; it is somewhat obscured by the irregular values of the standard deviations of the small arrays at the adult end of the age range. The mean value of the standard deviation of the weighted arrays is $\sigma_y\sqrt{1 - \eta^2}$ = 3.2992 in 2-millim. units.

We now turn to the regression curves to see how far the conditions for the different types are satisfied. We have

$\eta^2 - r^2$ = .005,312,
$\phi_2(\eta^2 - r^2) - \epsilon^2$ = .004,030,
$\phi_2(\eta^2 - r^2) - \epsilon^2 - (\zeta\phi_2 - \epsilon\phi_3)^2/(\phi_2\phi_4 - \phi_3^2)$ = .000,604.

* The contributions of the successive terms of (xxxiii.) are in fact given by

$\Sigma_\eta^2 = \frac{1}{N}$ {.824,785 + .001,870 + .004,673 - .000,472 + .001,888}.

† See Pearson: 'The Chances of Death and other Studies of Evolution,' vol. I., pp. 296, 307, 310, 314.

But the first should be zero, if the regression be linear; the second, if it be parabolic; and the third, if it be cubical.
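The first two criteria depend only on $\eta$, $r$, $\epsilon$ and $\phi_2$, so they can be recomputed directly from the printed constants. The following is a small check under that reading; the function name is an assumption, and the cubic criterion is left out because it needs further constants.

```python
def regression_criteria(eta, r, eps, phi2):
    """Return the linear and parabolic criteria as written in the text.

    linear    = eta^2 - r^2               (zero for linear regression)
    parabolic = phi2*(eta^2 - r^2) - eps^2 (zero for parabolic regression)
    """
    linear = eta ** 2 - r ** 2
    parabolic = phi2 * linear - eps ** 2
    return linear, parabolic

# Head height on age (Illustration B), constants as printed.
head_linear, head_parabolic = regression_criteria(0.303024, 0.294128, -0.071065, 1.709258)

# Cell size on body length in Daphnia (Illustration C), constants as printed.
daph_linear, daph_parabolic = regression_criteria(0.572287, 0.394862, -0.281831, 0.931908)

print(round(head_linear, 6), round(head_parabolic, 6))
print(round(daph_linear, 6), round(daph_parabolic, 6))
```

The Daphnia values are far larger than those for head height, which is why linear regression is rejected outright in that illustration.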

We see increasing approximation to fulfilment of the several conditions. Referred to axes through the mean age and head height, the following are the regression curves*:

(a.) Straight line:

$Y_x$ = .662,979 $X$.

(b.) Parabola (from equation (lxv.)):

$Y_x$ = .055,749 + .667,570 $X$ - .041,001 $X^2$.

(c.) Cubic (from equation (lvi.)):

$Y_x$ = .280,194 + .722,886 $X$ - .029,580 $X^2$ - .002,223 $X^3$.

(c'.) Cubic (from equation (lix.)):

$Y_x$ = .296,076 + .812,249 $X$ - .028,004 $X^2$ - .005,740 $X^3$.

(c') will not give as good results as (c), for it depends on a use of the condition (lvii.) which is not absolutely fulfilled.
The following table gives the values in the case of the four curves:

Table IV. $y_x$ = Mean Auricular Height of Girl's Head at Given Age.

$x_p$ = age.   Regression   Regression    Cubic (c).   Cubic (c').   Observed.
               line.        parabola.†
   3.5         117.95       114.49        116.90       118.94        115.25
   4.5         118.61       115.87        117.66       118.94        116.96
   5.5         119.27       117.17        118.42       119.16        117.47
   6.5         119.94       118.39        119.24       119.57        119.10
   7.5         120.60       119.52        120.08       120.14        120.30
   8.5         121.26       120.57        120.93       120.84        121.63
   9.5         121.92       121.55        121.78       121.62        121.72
  10.5         122.59       122.43        122.62       122.45        122.82
  11.5         123.25       123.24        123.42       123.26        123.14
  12.5         123.91       123.97        124.18       124.15        123.89
  13.5         124.58       124.61        124.88       124.95        124.86
  14.5         125.24       125.17        125.52       125.65        125.71
  15.5         125.90       125.65        126.07       126.22        126.16
  16.5         126.57       126.05        126.52       126.68        126.53
  17.5         127.23       126.36        126.87       126.93        126.91
  18.5         127.89       126.59        127.09       126.96        127.02
  19.5         128.55       126.75        127.18       126.74        129.56
  20.5         129.22       126.81        127.11       126.22        123.82
  21.5         129.88       126.80        126.88       125.38        126.50
  22.5         130.54       126.71        126.48       124.28        125.25

* $Y_x$ is here measured in millimetres and $X$ in years.

† The maximum ordinate is at vertex of parabola, i.e., $X$ = 8.1409, or age 20.84; its magnitude = 126.82.

An examination of this table and the graphs on Diagram II. seems to show:

(i.) That cubic (c) is considerably better than cubic (c').

(ii.) That we do get a sensible betterment in passing from parabola to cubic, and, accordingly, that we must use in this case the cubic to describe effectively the regression within the range of observation. Probably neither cubic nor parabola would effectively serve for extrapolation even close to the limits of observation.

Thus the cubic (c') starting at 3-4 with its point of inflection is clearly inadmissible, and the drop after 20 or 21 years of age, shown by both parabola and cubic, is, of course, only due to the anomalous character of the few girls over 18 left in the schools. Actually the shrinkage of measurements does not begin till at least 26 years, and is then far more gradual than these curves indicate.

But, as in all fitting of this kind, we obtain the best fit we can within the range, entirely at the expense of what may occur just outside the range. For this reason, as E. Perrin* has pointed out, a good interpolation curve is usually a bad extrapolation curve.

We might sum up our results for auricular height with age in girls by saying: That the correlation is non-linear, effectively cubic; heteroscedastic, there being increasing variability with growth; that while the total height frequency is not very far from normal the array frequencies are slightly heteroclitic, but so very irregular in sign, that probably we are dealing with a case of isocurtic homoclisy, to which the sparsity of data in the extreme arrays gives an appearance of anomic heteroclisy.

(10.) Illustration C. On the Skew Correlation between Size of Cell and Size of Body in Daphnia magna.

Dr. E. Warren has dealt with this point in a memoir published in 'Biometrika,'
vol.  II.,  pp.  255-9.  The  resulting  regression  curve  of  size  of  cell  for  given  size  of 
body  is  very  far  from  linear,  and  it  is  quite  clear  that  the  correlation  is  skew.  It 
has  already  been  noted  in  '  Biometrika '  that  the  relationship  is  considerably  obscured 
by  the  irregularities  produced  by  ecdysis.  Our  object  at  present,  however,  is  purely 
theoretical,  namely,  to  show  how  a  certain  system  of  constants  and  of  curves  describes 
the  actual  correlationship,  and  for  this  purpose  Dr.  Warren's  observations  form  as 
good  material  for  graduation  as  we  could  expect  to  find.  The  following  Table  V. 
gives  the  observations  with  the  working  scales  attached.  I  must  refer  to 
Dr.  Warren's  paper  (p.  256)  for  the  relation  between  the  units  of  grouping  on  the 
working  scales  and  those  of  the  actual  measurements  on  body  and  cell  lengths.  As 
far  as  correcting  the  raw  moments  is  concerned,  Sheppard's  corrections  were  used 
for  the  cell  sizes,  but  not  for  the  body  lengths,  because  the  number  of  individuals  in 
the latter case was perfectly arbitrary and there is no approach to high contact. The product moments were also uncorrected. The product moments were found in both ways (see p. 35, footnote) and the results thus verified.

* 'Biometrika,' vol. III., p. 99.

Table  V.  gives  the  means,  standard  deviations,  and  third  moments  of  the  arrays  ; 
the  latter  are  all  small  and  superficially  irregular  in  sign.  I  think  we  may  say  that 
there  is  no  marked  and  continuous  heteroclisy.  On  the  other  hand,  I  think  we  may 
say  that  while  the  clitic  curve  deviates  to  and  fro  from  a  zero  base,  the  scedastic 
curve  would  fit  better  to  a  parabolic  curve  than  to  the  straight  line  which  is  its 
mean. In other words, the variability of the cells increases with size of body (i.e.,
growth) up to a certain stage and then decreases again. This result is obscured by
the  fall  of  the  variability  after  each  ecdysis.  Roughly  the  ecdyses  produce  a  rhythm 
in  all  three  curves,  the  regression  curve,  the  scedastic  curve,  and  the  clitic  curve. 
When  the  means  of  the  arrays  are  above  the  regression  cubic,  then  the  ordinates  of 
the  scedastic  curve  are  above  their  mean  and  those  of  the  clitic  curve  show  positive 
skewness ;  when  they  are  below  the  regression  curve,  we  have  lessened  variability 
and  negative  skewness.  In  other  words,  the  ecdyses  are  accompanied  by  lessened 
cell  variability  and  negative  skewness  of  distribution.  I  think  we  may  state  that 
there  is  a  nomic  heteroscedasticity  due  to  growth  of  body,  giving  first  an  increased 
variability  with  growth  and  afterwards  a  decrease  with  age.  There  is  probably 
isocurtic  homoclisy.  Both  of  these  are,  however,  obscured  by  a  semi-rhythmic 
heteroscedasticity  and  heteroclisy  introduced  by  the  ecdyses. 

We now turn to the constants of the cell and body length distributions, merely noting that all these constants are given in terms of the units of the working scales.

Cell Constants.                                 Body Length Constants.

Mean cell = 9.268,657                           Mean body length = 8.502,488
$\sigma_y$ = 2.541,734                          $\sigma_x$ = 3.864,784
$\mu_2$ = 6.460,410                             $\nu_2$ = 14.936,562
$\mu_3$ = 2.142,362                             $\nu_3$ = -5.125,806
$\mu_4$ = 123.921,496                           $\nu_4$ = 432.769,533
                                                $\nu_5$ = -425.276,682
                                                $\nu_6$ = 15192.5375
$\beta_1$ = .017,021                            $\beta_1$ = .007,885
$\beta_2$ = 2.969,111                           $\beta_2$ = 1.939,793

Further:

$\sigma_M$ = 1.454,600                          $\beta_3$ = .043,796
$\lambda_2$ = 2.115,862                         $\beta_4$ = 4.559,091
$\lambda_4$ = 15.142,840                        $\sqrt{\beta_1}$ = -.088,798
                                                $\phi_2$ = .931,908
Hence                                           $\phi_3$ = -.232,167
$(\lambda_4 - 3\lambda_2^2)/(4\lambda_2^2)$ = .095,615          $\phi_4$ = .788,409

We have next the product moments referred to the means:
$p_{11}$ = 3.892,863, whence $r$ = .394,862,
$p_{21}$ = -12.104,322, $\epsilon$ = -.281,831,
$p_{31}$ = 127.348,064, $\zeta$ = .098,578,
$p_{41}$ = -541.433,455, $\theta$ = -.759,344.

Further, from $\sigma_M$,

$\eta$ = .572,287.

From the constants for the arrays I deduced

$\chi_1 - 1$ = -.108,148, $\chi_2$ = .088,323.

These are higher values of $\chi_1 - 1$ and $\chi_2$ than we have found in the first two illustrations.

We now obtain, showing the contribution of each term of (xxxiii.),

$\Sigma_\eta^2 = \frac{1}{N}$ {.452,240 - .002,528 + .010,803 - .013,180 - .027,875}.

Whence probable error of $\eta$ = .67449 $\Sigma_\eta$ = .0097.

Had we calculated the probable error of $\eta$ from (xxxiv.), we should have found it equal to .0101. The difference is greater than in the two previous illustrations, but is only .0004, and this would have no significance in any practical use of the probable error. We again conclude, therefore, that (xxxiv.) is sufficiently close to replace (xxxiii.) in practice.

For the mean standard deviation of the weighted arrays we have

$\sigma_y\sqrt{1 - \eta^2}$ = 2.084,358.

If we now examine the criteria for the nature of the regression, we have

$\eta^2 - r^2$ = .171,596,
$\phi_2(\eta^2 - r^2) - \epsilon^2$ = .080,483,
$\phi_2(\eta^2 - r^2) - \epsilon^2 - (\zeta\phi_2 - \epsilon\phi_3)^2/(\phi_2\phi_4 - \phi_3^2)$ = .079,457.

We should conclude, therefore, that linear regression is inadmissible, but that parabolic or cubic will be moderately successful, the latter not very much better than the former. Our moderate success only in this case is, of course, due to the irregularity of the results to be graduated, the influence of the ecdyses being so disturbing that we really need a curve periodically varying from the graduated regression curve.

We have the following regression curves:

(a.) Straight line:

$Y_x$ = .259,687 $X$.

(b.) Parabola from (lxv.):

$Y_x$ = 1.097,690 + .236,135 $X$ - .073,490 $X^2$.

The maximum occurs when $X$ = 1.6066, and is given by $Y_x$ = 1.2874, thus occurring within the limits of observation.*

(c.) Cubic from (lix.):

$Y_x$ = .752,856 + .193,058 $X$ - .049,817 $X^2$ + .001,710 $X^3$.

In all these cases $Y_x$ and $X$ are measured from the means of the cell and body lengths, or from 9.268,657 and 8.502,488 respectively.

Table VI. gives the calculated and observed results, and the whole system is represented in Diagram III. Either the parabola or cubic graduates quite well the results, allowing for the periodic deviation, and we may fairly describe the system as a heteroscedastic cubic regression with isocurtic homoclisy. The correlation ratio is very sensibly different from the correlation coefficient. The regression cubic does not differ widely from that given in 'Biometrika,' which was obtained without weighting the means of the arrays, and by simply striking the best cubic of the given type through the points.


Table VI. $y_x$ = Mean Cell Length for Given Body Length in Daphnia.

$x$ = body     Regression   Regression   Regression   Observed.
length.        line.        parabola.    cubic.
    1           7.320        4.458        5.047        5.300
    2           7.580        5.724        6.190        5.833
    3           7.840        6.842        7.166        7.790
    4           8.099        7.813        7.986        8.050
    5           8.359        8.638        8.661        9.473
    6           8.619        9.315        9.200        8.436
    7           8.879        9.846        9.613        8.596
    8           9.138       10.229        9.912       10.267
    9           9.398       10.466       10.105       10.761
   10           9.658       10.555       10.205       11.027
   11           9.917       10.498       10.220       10.953
   12          10.177       10.293       10.161        9.100
   13          10.437        9.942       10.038        9.000
   14          10.696        9.443        9.861       10.036
   15          10.956        8.798        9.642       10.317

* Actual values on working scales (for the maximum of the Daphnia parabola), $x$ = 10.1091 and $y$ = 10.5560.

(11.) Illustration D. On the Skew Correlation between Number of Branches to the Whorl and Position of the Whorl on the Stem in Equisetum arvense.

I have selected this example not on account of any biological importance, because the material is (especially with regard to the first and last two whorls) unsatisfactory either on account of irregularity or of insufficiency of material. It has been taken purely from its statistical interest, because it gives a series with markedly skew correlation, having a regression curve of a rough S-shaped character. If we omit the first and last whorls, we get, as I have already shown,* a remarkably close fit with a cubical regression curve. My present object, however, is not to consider any law of growth, but merely a mass of statistical material, to be dealt with by the processes of the present paper.

We may anticipate that the irregularities of the series, indicated in the memoir just referred to, will make themselves manifest in a less satisfactory fitting of the regression curve than occurs when we deal with the more homogeneous group of equally weighted whorls fitted in the diagram of that paper. Table VII. gives the data, with the means, standard deviations, and third moments of each array.

The axis of $x$ shall be taken to give the position of the whorl on the stem and that of $y$ to denote the number of branches. We require the regression curve of $y$ on $x$, or the probable number of branches on a whorl in a given position. We shall not use Sheppard's corrections for the moments of either the $x$ or $y$ characters, as high contact certainly does not hold for both at the low-value ends of their ranges.

We have the following constants:

Position Constants.                             Branch Constants.

Mean position = 6.403,315                       Mean number of branches = 7.216,851
$\sigma_x$ = 3.542,604                          $\sigma_y$ = 3.278,499
$\nu_2$ = 12.550,046                            $\mu_2$ = 10.748,557
$\nu_3$ = 8.249,534                             $\mu_3$ = -24.313,478
$\nu_4$ = 319.515,824                           $\mu_4$ = 245.811,660
$\nu_5$ = 644.095,176
$\nu_6$ = 11203.5814
$\beta_1$ = .034,429                            $\beta_1$ = .476,044
$\beta_2$ = 2.028,625                           $\beta_2$ = 2.127,658
$\beta_3$ = .214,190
$\beta_4$ = 5.667,884                           Further:
$\sqrt{\beta_1}$ = .185,550                     $\sigma_M$ = 2.789,949
$\phi_2$ = .994,196                             $\lambda_2$ = 7.783,815
$\phi_3$ = .592,384                             $\lambda_4$ = 140.441,685
$\phi_4$ = 1.518,136                            Hence $(\lambda_4 - 3\lambda_2^2)/(4\lambda_2^2)$ = -.170,503
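The $\beta$ constants for the position character follow from the moments $\nu_2 \ldots \nu_6$ by the usual ratios, and can be recomputed as a check on the printed figures. In the sketch below the relations used for $\phi_2$ and $\phi_4$ are not printed in this section (they are referred to equations (li.) and (liv.)); they are assumed here only because they reproduce the printed values. The function name is illustrative.

```python
def pearson_betas(nu2, nu3, nu4, nu5, nu6):
    """Moment ratios beta_1 .. beta_4 from the moments about the mean."""
    beta1 = nu3 ** 2 / nu2 ** 3
    beta2 = nu4 / nu2 ** 2
    beta3 = nu3 * nu5 / nu2 ** 4
    beta4 = nu6 / nu2 ** 3
    return beta1, beta2, beta3, beta4

# Position constants for Equisetum, as printed.
b1, b2, b3, b4 = pearson_betas(12.550046, 8.249534, 319.515824, 644.095176, 11203.5814)

# Assumed relations (not printed in this section), checked against the text:
phi2 = b2 - b1 - 1.0
phi4 = b4 - b2 ** 2 - b1

print(round(b1, 6), round(b2, 6), round(b3, 6), round(b4, 6))
print(round(phi2, 6), round(phi4, 6))
```

Agreement is to within a unit or so in the sixth decimal place, as expected from the rounding of the printed moments.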

* 'Proc. Roy. Soc.,' vol. 71, p. 308.

We have next the product moments referred to the means:

$p_{11}$ = -8.225,585, whence $r$ = -.708,222,
$p_{21}$ = -21.471,321, $\epsilon$ = -.390,436,
$p_{31}$ = -205.084,042, $\zeta$ = +.029,733,
$p_{41}$ = -917.984,938, $\theta$ = -.960,212.

Further, from $\sigma_M$,

$\eta$ = .850,984.

From the constants for the arrays we deduce

X^-l=--S56,367,       x-3= -'3^2,952. 
We now obtain, showing the contribution of each term of (xxxiii.),

$\Sigma_\eta^2 = \frac{1}{N}$ {.076,080 - .157,932 + .055,359 + .079,662 + .038,579}.

Whence probable error of $\eta$ = .67449 $\Sigma_\eta$ = .0054.

Had we calculated the probable error of $\eta$ from (xxxiv.) we should have found it equal to .0049. The difference .0005 is not of importance for practical purposes. Yet in this case it is clear that the values of $\chi_1 - 1$ and $\chi_2$ are very sensible. Thus we see that a very marked heteroscedastic and heteroclitic system with continuously changing standard deviation and skewness scarcely affects for practical purposes (i.e., to three significant figures) the probable error of $\eta$. All four of our illustrations therefore confirm the conclusion that:

For practical purposes the probable error of the correlation ratio, $\eta$, may be taken as .67449 $(1 - \eta^2)/\sqrt{N}$.

Our Diagram IV. gives the values of the relative standard deviations of the arrays, or $\sigma_{n_x}/\sigma_y$, the horizontal line giving $\sqrt{1 - \eta^2}$ = .5252, or the mean value of the relative standard deviations of the weighted arrays. We have also the clitic curve giving $\tfrac{1}{2}\sqrt{\beta_1}$ for each array.* The remarkable smoothness of these scedastic and clitic curves in this case indicates how far certain types of correlation surfaces diverge from pure normality of distribution, the divergence being obviously nomic.

We now turn to the regression curves and write down the conditions for the different types; the three expressions should be zero for linear, parabolic, and cubical regression respectively:

$\eta^2 - r^2$ = .222,596,
$\phi_2(\eta^2 - r^2) - \epsilon^2$ = .068,864,

We see at once that the straight line is inadmissible, the parabola will not be very good, and the cubic only moderately appropriate. The conditions are not nearly so closely fulfilled as in the cases of woodruff and head heights; the last two are better than in the case of Daphnia cells, but while the deviations in the case of Daphnia were irregular, there being no approximate smoothness in the scedastic or clitic curves, we shall find here more uniform deviations which would probably be partially allowed for by a quartic regression curve.

* $\tfrac{1}{2}\sqrt{\beta_1}$ = difference between mode and mean divided by standard deviation = skewness in the case of skew-curves of Type III. ('Phil. Trans.,' A, vol. 186, p. 373), and may be taken as a reasonable measure of the skewness for those cases in which the fuller form involving $\beta_2$ would involve too laborious calculations. If in equation (xii.) of the present memoir we put $\beta_2$ = 3 + a small quantity, and remember that $\beta_1$ is itself a small quantity, we see that the more correct formula for the skewness involving $\beta_2$ reduces, neglecting terms of 2nd order, to $\tfrac{1}{2}\sqrt{\beta_1}$.

The following are the regression curves:

(a.) Straight line:

$Y_x$ = -.655,423 $X$.

(b.) Parabola from (lxv.):

$Y_x$ = 1.551,307 - .574,171 $X$ - .123,610 $X^2$.

The maximum ordinate is at the position $X$ = -2.3225, or $x$ = 4.0808, with maximum number of branches $y_x$ = 9.435.

(c.) Cubic from (lvi.):

$Y_x$ = 1.590,413 - .987,694 $X$ - .137,641 $X^2$ + .016,605 $X^3$.

In all cases $X$ and $Y_x$ are measured from the mean position and the mean number of branches, i.e., 6.403,315 and 7.216,851 respectively.

The  following  table  contains  the  calculated  and  observed  results  : — 


Table VIII. Mean Number of Branches to each Whorl in Equisetum.

Position.   Regression   Regression   Regression   Observed.   Regression cubic
            line.        parabola.    cubic.                   without first whorl.
    1        10.758        8.262        7.506        7.619       [8.207]
    2        10.103        8.900        9.070        9.294        8.929
    3         9.447        9.291        9.920        9.627        9.869
    4         8.792        9.434       10.156        9.730       10.161
    5         8.137        9.330        9.876        9.643        9.911
    6         7.481        8.980        9.182        9.427        9.224
    7         6.826        8.382        8.172        8.732        8.205
    8         6.170        7.536        6.947        7.297        6.962
    9         5.515        6.444        5.605        5.555        5.599
   10         4.859        5.104        4.247        3.964        4.223
   11         4.204        3.517        2.971        2.443        2.939
   12         3.549        1.683        1.879        1.866        1.854
   13         2.893       -0.399        1.069        1.462        1.072
   14         2.238       -2.727        0.641        1.333        0.700
   15         1.582       -5.303        0.694        1.250        0.844
   16         0.927       -8.126        1.328        1.000        1.610

In the last column I have placed the results of re-working the whole system, omitting the first whorl as largely influenced by the ground condition at the foot of the stem.* The improvement of fit is not sufficiently great to justify a publication of all the constants for the distribution in this modified case. But there is improvement for the higher whorls, which are so few in number as to be wholly insignificant when compared with the weight of the first few low whorls.

It will be noticed at once that the line and the parabola (which gives at the top of the stem negative numbers!) are absolutely unsuitable for representing the facts of the case. The cubic is better and certainly gives the general trend of the observations, but in this our last illustration we have clearly reached the limit of material to which such cubical regression can be satisfactorily applied. See Diagram V.

(12.)  Quartic  Regression. 

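It seemed of some interest in this case of Equisetum to ascertain whether any real improvement in description would be reached by considering the quartic regression curve. I briefly indicate the theory in this case as developed from the general method in the footnote, p. 25. Eliminating $b_0$ and $b_1$ by the processes familiar to us from the case of cubical regression, we have

$$\frac{Y_x}{\sigma_y} = r\,\frac{X}{\sigma_x} + b_2\left\{\left(\frac{X}{\sigma_x}\right)^2 - \sqrt{\beta_1}\,\frac{X}{\sigma_x} - 1\right\} + b_3\left\{\left(\frac{X}{\sigma_x}\right)^3 - \beta_2\,\frac{X}{\sigma_x} - \sqrt{\beta_1}\right\} + b_4\left\{\left(\frac{X}{\sigma_x}\right)^4 - \frac{\beta_3}{\sqrt{\beta_1}}\,\frac{X}{\sigma_x} - \beta_2\right\} \quad \text{(lxx.)}.$$

Hence as before

$$\epsilon = b_2\phi_2 + b_3\phi_3 + b_4\phi_5, \qquad \zeta = b_2\phi_3 + b_3\phi_4 + b_4\phi_6, \qquad \theta = b_2\phi_5 + b_3\phi_6 + b_4\phi_7 \quad \text{(lxxi.)},$$

where $\phi_2$, $\phi_3$, and $\phi_4$ are given as before by (li. and liv.), while

$$\phi_5 = \beta_4 - \beta_3 - \beta_2 \quad \text{(lxxii.)},$$

$$\phi_6 = \{\beta_5 - \beta_2\beta_3 - \beta_1\beta_2\}/\sqrt{\beta_1} \quad \text{(lxxiii.)},$$

$$\phi_7 = \beta_6 - \beta_2^2 - \beta_3^2/\beta_1 \quad \text{(lxxiv.)},$$

and

$$\beta_5 = \nu_7\nu_3/\sigma_x^{10}, \qquad \beta_6 = \nu_8/\sigma_x^{8} \quad \text{(lxxv.)}.$$

Solving, we have

$$b_4 = \frac{\theta(\phi_2\phi_4 - \phi_3^2) - \epsilon(\phi_4\phi_5 - \phi_3\phi_6) - \zeta(\phi_2\phi_6 - \phi_3\phi_5)}{\phi_7(\phi_2\phi_4 - \phi_3^2) - \phi_5(\phi_4\phi_5 - \phi_3\phi_6) - \phi_6(\phi_2\phi_6 - \phi_3\phi_5)} \quad \text{(lxxvi.)},$$

and

$$b_2 = \frac{\epsilon\phi_4 - \zeta\phi_3}{\phi_2\phi_4 - \phi_3^2} - b_4\,\frac{\phi_4\phi_5 - \phi_3\phi_6}{\phi_2\phi_4 - \phi_3^2}, \qquad b_3 = \frac{\zeta\phi_2 - \epsilon\phi_3}{\phi_2\phi_4 - \phi_3^2} - b_4\,\frac{\phi_2\phi_6 - \phi_3\phi_5}{\phi_2\phi_4 - \phi_3^2} \quad \text{(lxxvii.)}.$$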

Substituting in (lxx.), the solution is completed. The advantage of this form is that we see clearly the modifications made in $b_2$ and $b_3$ as we pass from cubical to quartic regression. On the other hand, $\phi_6$ and $\phi_7$, as shown by (lxxv.), involve the 7th and 8th moments of the $x$-character. These are not only very laborious to calculate, but, as we have already shown, are as a rule very untrustworthy.

If we proceed as on p. 26, equation (lvii.), we find

$$\eta^2 - r^2 = \epsilon b_2 + \zeta b_3 + \theta b_4 \quad \text{(lxxviii.)}.$$

Using this and not the third equation of (lxxi.), we replace (lxxvi.) by

$$b_4 = \frac{(\phi_2\phi_4 - \phi_3^2)\left\{\eta^2 - r^2 - \epsilon^2/\phi_2 - (\zeta\phi_2 - \epsilon\phi_3)^2/[\phi_2(\phi_2\phi_4 - \phi_3^2)]\right\}}{\theta(\phi_2\phi_4 - \phi_3^2) - \epsilon(\phi_4\phi_5 - \phi_3\phi_6) - \zeta(\phi_2\phi_6 - \phi_3\phi_5)} \quad \text{(lxxix.)}.$$

This equation for $b_4$ only involves the 7th and not the 8th moment, but like the corresponding form (lx.) suffers from being a ratio of small quantities. (lxxvii.) completes the solution as before.

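(lxxvii.) and (lxxix.) in conjunction give us a necessary condition for quartic regression. We can indeed now write the whole series of conditions as follows:

Linear regression:

$$\eta^2 - r^2 = 0.$$

Parabolic regression:

$$\eta^2 - r^2 - \epsilon^2/\phi_2 = 0.$$

Cubical regression:

$$\eta^2 - r^2 - \epsilon^2/\phi_2 - (\zeta\phi_2 - \epsilon\phi_3)^2/\{\phi_2(\phi_2\phi_4 - \phi_3^2)\} = 0.$$

Quartic regression:

$$\eta^2 - r^2 - \frac{\epsilon^2}{\phi_2} - \frac{(\zeta\phi_2 - \epsilon\phi_3)^2}{\phi_2(\phi_2\phi_4 - \phi_3^2)} - \frac{\{\theta(\phi_2\phi_4 - \phi_3^2) - \epsilon(\phi_4\phi_5 - \phi_3\phi_6) - \zeta(\phi_2\phi_6 - \phi_3\phi_5)\}^2}{(\phi_2\phi_4 - \phi_3^2)\,\Delta} = 0 \quad \text{(lxxx.)},$$

where $\Delta = \phi_7(\phi_2\phi_4 - \phi_3^2) - \phi_5(\phi_4\phi_5 - \phi_3\phi_6) - \phi_6(\phi_2\phi_6 - \phi_3\phi_5)$.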

We now have a third possibility: we can get rid of the fourth product moment $\theta$ from the value of $b_4$ and write it:

While this value of $b_4$ does not suffer like (lxxix.) from being the ratio of small quantities, and would a priori appear to save the calculation of $\theta$, yet the right sign of the root may not be obvious on inspection, so that an actual determination of $\theta$ to find the sign of $b_4$ may after all be needful. If (lxxx.) were absolutely satisfied, (lxxxi.), (lxxix.) and (lxxvi.) would lead to identical results; but this will rarely be true in practice. In any of the three cases $b_2$ and $b_3$ will be given by (lxxviii.). On the whole, I consider that (lxxxi.) and (lxxvi.) will give the better results, and probably the former the best, but it will generally require as much arithmetic as the latter.

(13).  Illustration  E. — Calculation  of  the  Quartic  Regression  Curve  in  the  Case 

of  Equisetum  arvense. 

The  only  new  constants  required  are  : 

1/7  =  43,207-386,       whence  j85  =  l-144,882, 
1/8  =  507,649-540,  )8fi=20-463,633, 

(^5=3-425,069,  <^6  =  3-452,046, 

<^7  =  15-015.792. 

These  lead  us  to  : 


and 


M,-<f>s<k  ^  2-723,384,        ^fj^^A  =  i.2li,194, 

=  1-745,622. 


A^  = 


<^5>  <f>6' 

Our  successive  conditions  are  therefore  : 

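$\eta^2 - r^2 = 0.222596$,
$\eta^2 - r^2 - \epsilon^2/\phi_2 = 0.069266$,

$\eta^2 - r^2 - \epsilon^2/\phi_2 - (\zeta\phi_2 - \epsilon\phi_3)^2 / \{\phi_2(\phi_2\phi_4 - \phi_3^2)\} = 0.010186$,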
[B  {<j>^<f>^  —  <f>S^)  —  e  (<^4<^5  —  —  li  <^20fi  —  <^3<^5  )  1  ^ 


•007,200, 


whence  we  see  the  successive  approximations  to  the  fulfilment  of  the  conditions. 
Clearly  great  gains  arise  when  we  pass  from  linear  to  parabolic,  and  from  parabolic 
to  cubic  regression,  but  the  advance  is  not  so  conspicuous  when  we  pass  to  quartic 
regression. 

We have:

From (lxxvi.): $b_4 = 0.044517$, and $b_2 = -0.648122$, $b_3 = 0.171260$,
From (lxxix.): $b_4 = 0.151842$, and $b_2 = -0.940410$, $b_3 = 0.041981$,
From (lxxxi.): $b_4 = 0.025999$, and $b_2 = -0.597691$, $b_3 = 0.193688$.

The equations to the three corresponding quartics are:

(a). $Y_{x_p} = 1.724611 - 0.913208\,X_p - 0.169311\,X_p^2 + 0.012629\,X_p^3 + 0.000927\,X_p^4$,
(b). $Y_{x_p} = 2.047717 - 0.734966\,X_p - 0.245667\,X_p^2 + 0.003096\,X_p^3 + 0.003161\,X_p^4$,
(c). $Y_{x_p} = 1.668788 - 0.944192\,X_p - 0.156137\,X_p^2 + 0.014283\,X_p^3 + 0.000541\,X_p^4$.

The values of $Y_{x_p}$ and $X_p$ are as before measured from the means, or 7.216851 and
6.403315 respectively.

The  values  of  the  observed  and  calculated  ordinates  are  given  in  Table  IX.,  and 
the  graph  of  the  results  in  the  lower  half  of  Diagram  V. 


Table IX. Mean Number of Branches to Whorl in Equisetum deduced from Quartic Regression.

Position.   Quartic (a).   Quartic (b).   Quartic (c).   Observed.
    1           7.731          8.269          7.637         7.619
    2           8.950          8.662          9.000         9.294
    3           9.715          9.222          9.800         9.627
    4          10.014          9.674         10.073         9.730
    5           9.858          9.816          9.866         9.643
    6           9.281          9.521          9.240         9.427
    7           8.339          8.740          8.270         8.732
    8           7.109          7.498          7.042         7.297
    9           5.692          5.898          5.656         5.555
   10           4.209          4.116          4.225         3.964
   11           2.816          2.407          2.875         2.443
   12           1.651          1.100          1.745         1.866
   13           0.930          0.600          0.987         1.462
   14           0.857          1.389          0.766         1.333
   15           1.665          4.022          1.259         1.250
   16           3.609          9.133          2.657         1.000
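
The calculated ordinates of Table IX. can be checked by evaluating the three quartics directly. The following is a minimal sketch, assuming the coefficients, the two means and the observed column as they are read from the text above; the function names and the unweighted root-mean-square summary are illustrative additions and not part of the memoir, which weights each array by its frequency.

```python
# Coefficients of the quartics (a), (b), (c), lowest power first,
# as read from the text (X and Y measured from their means).
QUARTICS = {
    "a": (1.724611, -0.913208, -0.169311, 0.012629, 0.000927),
    "b": (2.047717, -0.734966, -0.245667, 0.003096, 0.003161),
    "c": (1.668788, -0.944192, -0.156137, 0.014283, 0.000541),
}
MEAN_Y = 7.216851  # mean number of branches
MEAN_X = 6.403315  # mean position of whorl

# Observed column of Table IX., positions 1 to 16.
OBSERVED = [7.619, 9.294, 9.627, 9.730, 9.643, 9.427, 8.732, 7.297,
            5.555, 3.964, 2.443, 1.866, 1.462, 1.333, 1.250, 1.000]


def ordinate(label, position):
    """Calculated mean number of branches at a whorl position."""
    x = position - MEAN_X
    value = 0.0
    for coefficient in reversed(QUARTICS[label]):  # Horner's rule
        value = value * x + coefficient
    return MEAN_Y + value


def rms_residual(label):
    """Unweighted RMS difference from the observed column (illustrative only)."""
    squares = [(ordinate(label, p) - obs) ** 2
               for p, obs in enumerate(OBSERVED, start=1)]
    return (sum(squares) / len(squares)) ** 0.5


if __name__ == "__main__":
    for position in range(1, 17):
        row = "  ".join(f"{ordinate(k, position):7.3f}" for k in "abc")
        print(f"{position:2d}  {row}  {OBSERVED[position - 1]:7.3f}")
    for k in "abc":
        print(f"quartic ({k}): unweighted RMS residual = {rms_residual(k):.3f}")
```

Because the residuals are not weighted by the number of plants in each array, the sparse positions beyond the 13th whorl dominate this summary; it is a rough guide only.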

From  these  results  we  deduce  the  following  conclusions : — 

(i.)  That  the  use  of  a  quartic  instead  of  a  cubic  regression  curve  has  not  very 
markedly  bettered  the  fit.  The  failure  to  get  a  closer  fit  lies  largely  in  the  nature  of 
the  material.  The  number  of  plants  with  more  than  13  whorls  is  very  few,  and  their 
contribution allows little weight to the tail of the regression curve. Further, all our
attempts to fit a smooth regression curve show that the observed data are unduly
flattened  at  the  top.  If  we  confine  ourselves  to  a  homogeneous  series  of  110  plants 
with  ten  whorls  apiece,  we  get  a  remarkably  good  fit.*  The  S-shape  of  the 
regression  line  as  indicated  in  both  cubic  and  quartic  does,  however,  appear  to  be 
characteristic  of  the  nature  of  the  plant,  and  I  take  it  that  more  ample  material 
would  allow  of  a  closer  analytical  description  by  a  simple  cubic.  I  doubt  whether  for 
practical  statistics  the  use  of  the  quartic  will  often  be  requisite. 

(ii.) The comparative failure of the quartic (b) shows us that a formula like (lxxix.)
is of small service. This corresponds fully to our experience in the use of (lx.) in the
case of the cubic. In both cases we get rid of a high moment by making a certain
constant the ratio of two small quantities, and experience shows us that the result is
unsatisfactory. It is accordingly preferable to use formulae involving high moments
of one variable rather than those with a ratio of small quantities.

(iii.) The quartic (c) appears as good as, if not slightly better than, quartic (a). In
(c) we have got rid of a high product moment, $\theta$, by supposing the quartic condition
(lxxx.) rigidly fulfilled. This of course is not the case. It is clear that product
moments like $\theta$ of the 5th order are far from advantageous, and this is the same principle
which was in evidence when we found (lxv.) giving better results than (lxiv.) for
parabolic regression. Hence we must further conclude that the use of third, fourth or
fifth product moments is disadvantageous as compared respectively with fifth to eighth
moments of one variable. Or, a moment two degrees higher is preferable to a product
moment in calculating correlation values. This is, I think, consonant with our
knowledge of the relative magnitude of the probable errors in the two cases.

(14.)  General  Conclusions. 

(i.)  The  present  paper  provides  us  with  a  general  method  of  dealing  with  the 
regression  line  and  the  variability  of  arrays  in  the  case  of  skew  correlation,  without 
any  assumption  as  to  the  analytical  form  of  the  skew  correlation  surface. 

(ii.)  It  provides  a  nomenclature  and  classification  of  the  types  of  array  variability 
which  may  be  of  service. 

Arrays are either homoclitic or heteroclitic, according as their skewnesses are of
equal magnitude or not. Arrays are further homoscedastic or heteroscedastic,
according as their standard deviations are alike or different. Skew arrays are termed
allocurtic; if arrays are symmetrical about their mean, they are isocurtic.

A  heteroclitic  system  of  arrays  may  be  nomic  or  anomic,  according  as  the  skewness 
of  the  arrays  changes  continuously  or  irregularly  with  the  position  of  the  array. 

A heteroscedastic system of arrays is also either nomic or anomic, according as the
standard deviation of the arrays changes continuously or irregularly with the
position of the arrays. Anomic heteroclisy and anomic heteroscedasticity probably only
signify that our material is either heterogeneous or too sparse to free us from the
large errors of random sampling in the extreme arrays. Still the terms will be
found of use in describing the actual data.

* 'Roy. Soc. Proc.,' vol. 71, p. 308.

The  curve  in  which  the  skewness  of  the  array  is  plotted  to  its  position  is  termed 
the  clitic  curve ;  the  curve  in  which  the  ratio  of  the  standard  deviation  of  the  array 
to  the  standard  deviation  of  the  character  in  the  population  at  large  is  plotted  to 
position  is  termed  a  scedastic  curve. 

(iii.) The types of regression have been classified into linear, parabolic, cubic and
quartic. For most practical purposes the first three suffice. Necessary criteria
have  been  given  for  each  case.  But  as  in  the  case  of  the  skew  frequency  of  one 
character,  an  indefinite  number  of  conditions  ought  theoretically  to  be  fulfilled. 
Practically  in  dealing  with  frequency,  no  criteria  are  absolutely  fulfilled,  and  the 
probable  errors  of  the  expressions  used  become  unmanageable  as  we  ascend  in  the 
scale.  We  must  therefore  be  content  to  estimate  the  degree  of  approximation  with 
which  one  or  two  necessary  criteria  are  satisfied. 

The fundamental test of deviation from the familiar form of linear regression is the
inequality of the correlation coefficient $r$ and the newly introduced correlation
ratio $\eta$. The probable error of this latter is determined. It is shown that
$\sigma_y\sqrt{1 - \eta^2}$ is the mean standard deviation of a system of arrays in skew correlation.
The ease with which $\eta$ can be calculated suggests that in many cases it should
accompany, if not replace, the determination of the correlation coefficient.
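
The comparison of $r$ with $\eta$ can be illustrated on a small table of arrays. The sketch below assumes the usual definition of the correlation ratio, namely that $\eta^2$ is the frequency-weighted variance of the array means divided by the total variance of y; the function name and the two toy data sets are hypothetical and are not taken from the memoir.

```python
from math import sqrt


def r_eta_and_array_sd(arrays):
    """arrays maps each x value to the list of y values observed in that array.

    Returns (r, eta, mean array standard deviation), where the last is
    sigma_y * sqrt(1 - eta^2).
    """
    n_total = sum(len(ys) for ys in arrays.values())
    mean_x = sum(x * len(ys) for x, ys in arrays.items()) / n_total
    mean_y = sum(sum(ys) for ys in arrays.values()) / n_total

    var_x = sum(len(ys) * (x - mean_x) ** 2 for x, ys in arrays.items()) / n_total
    var_y = sum((y - mean_y) ** 2 for ys in arrays.values() for y in ys) / n_total
    cov = sum((x - mean_x) * (y - mean_y)
              for x, ys in arrays.items() for y in ys) / n_total

    # Variance of the array means about the general mean, weighted by frequency.
    between = sum(len(ys) * (sum(ys) / len(ys) - mean_y) ** 2
                  for ys in arrays.values()) / n_total

    r = cov / sqrt(var_x * var_y)
    eta = sqrt(between / var_y)
    mean_array_sd = sqrt(max(var_y * (1.0 - eta ** 2), 0.0))
    return r, eta, mean_array_sd


if __name__ == "__main__":
    # Array means on a straight line: eta coincides with r.
    linear = {0: [-1, 1], 1: [1, 3], 2: [3, 5]}
    # Array means on a parabola with no scatter: r vanishes, eta is unity.
    parabolic = {-2: [4, 4], -1: [1, 1], 0: [0, 0], 1: [1, 1], 2: [4, 4]}
    for name, data in (("linear", linear), ("parabolic", parabolic)):
        r, eta, sd = r_eta_and_array_sd(data)
        print(f"{name}: r = {r:.4f}, eta = {eta:.4f}, mean array s.d. = {sd:.4f}")
```

In the first toy set the array means lie on a straight line, so $\eta$ and $r$ agree; in the second the regression is purely parabolic, so $r$ vanishes while $\eta$ does not.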

In the determination of the constants of the regression curve we must use
moments and product moments. The limitations to the order of the curve used
depend: (a) on the labour of the arithmetic, (b) on the increasing probable errors of
the higher moments and product moments. For these reasons it seems idle to propose
going beyond the 6th to 8th moments, or the 3rd to 5th product moments. Practical
experience suggests that little is to be gained by using moments beyond the 6th, or
product moments beyond the 3rd. A quartic regression curve may be useful
occasionally, but it has yet to justify its necessity. As our object is not to
reproduce the given data, but to provide a graduation for them, which smooths down the
errors of random sampling, we believe that any legitimate and practical theory must
discard the high moments and high product moments with which Thiele and Lipps
propose to deal.

(iv.) There is one point to which reference ought to be made. Some reader may
enquire why the method of my paper on curve fitting* should not be applied
to these regression curves in general, as we have in practice once or twice
already applied it. It would seem that that method is the easier, involving in the
case of the quartic only quantities analogous to our $r$, $\epsilon$, $\zeta$ and $\theta$. The answer is
straightforward: that process supposes every $y_{x_p}$ to have equal weight, or $n_{x_p}$ to be
the same for each array. Hence the higher moments of the x-character, which are
really involved, can be written down without calculation once and for all.* The
complexity of our present investigation arises from the introduction of the weighting
into the calculation of the moments of the x-character, as well as into that of the
product moments $r$, $\epsilon$, $\zeta$, $\theta$. Our results therefore, although they might not look so
good on a graph of the regression curve, would be markedly better, if due weight
were given to the frequency of each array. The difference of the two conceptions is
comparable to the determination of the regression on the one hand from the
correlation coefficient, and on the other from merely striking a line through the
plotted means of the arrays. The method of moments in the present case, if we
except the use of $\eta$, is identical with that of fitting a curve to a continuum in space
by the method of least squares.

* "On the Systematic Fitting of Curves to Observations and Measurements." 'Biometrika,'
vol. I., pp. 265-303, and vol. II., pp. 1-23, especially the latter, pp. 11-15.
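
The effect of weighting each array by its frequency can be seen in the simplest case of a straight line. The sketch below is illustrative only: the helper `fit_line` and the four array means with their frequencies are hypothetical values chosen so that one sparse extreme array lies far from the trend of the others.

```python
def fit_line(xs, ys, weights):
    """Weighted least-squares straight line; returns (slope, intercept)."""
    w_sum = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / w_sum
    mean_y = sum(w * y for w, y in zip(weights, ys)) / w_sum
    s_xy = sum(w * (x - mean_x) * (y - mean_y)
               for w, x, y in zip(weights, xs, ys))
    s_xx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    positions = [1, 2, 3, 4]              # hypothetical array positions
    array_means = [1.0, 2.0, 3.0, 10.0]   # hypothetical means of the arrays
    frequencies = [50, 40, 9, 1]          # hypothetical numbers in each array

    # Every array mean given equal weight: a line struck through the plotted means.
    print("equal weights:    ", fit_line(positions, array_means, [1, 1, 1, 1]))
    # Each array weighted by its frequency, as in the method of moments.
    print("frequency weights:", fit_line(positions, array_means, frequencies))
```

With equal weights the single outlying array pulls the slope sharply upward; with frequency weights it has little influence, which is the distinction drawn in the paragraph above.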

(v.) No stress whatever is laid on the actual instances here selected for illustration of
the methods of this paper. I have merely chosen out of available material cases in
which I had come across skew regression of various types. Thus we find:

(a.) The correlation of the number of branches and position of the whorl in
Asperula  odorata  is  practically  parabolic,  homoscedastic  and  of  nomic  heteroclisy. 

(b.)  The  correlation  between  auricular  height  of  head  and  age  in  girls  is  cubical, 
of  nomic  heteroscedasticity  and  of  anomic  heteroclisy.  It  is  probably  really  a  case 
of  isocurtosis. 

(c.)  The  correlation  of  size  of  cell  and  size  of  body  in  Daphnia  magna,  allowing 
for  the  irregularities  produced  by  the  ecdyses,  is  parabolic  or  cubic,  of  nomic 
heteroscedasticity,  and  probably,  but  for  the  above-mentioned  irregularities,  of 
isocurtic  homoclisy. 

(d.) The correlation of the number of branches and position of the whorl in
Equisetum arvense is cubical or possibly even quartic, of markedly nomic
heteroscedasticity and markedly nomic heteroclisy.

It is not impossible that slips have occurred in the lengthy arithmetic involved, but
every important piece of work has been done independently twice, once by Dr. Alice
Lee, whom I have most heartily to thank for her unwearying assistance, and once
by myself. To preserve uniformity of working, the constants have in each case
been carried to six figures. This involves little or no additional trouble, using as we
do mechanical calculators. The final results are of course of no value beyond their
probable errors, which will be in the second or third place of figures. No doubt I
shall be told that there is a show of accuracy in the number of decimal figures
retained, which does not really exist. It does not exist (and I am as fully conscious
of its non-existence as any would-be critic) so far as our results fit the actual
population, of which we have but a random sample. The figures, however, are of
importance, as far as testing accuracy of fit of result to actual sample goes. The
cubic or quartic curves may have coefficients insensible before the third or fourth
figure of decimals, and these coefficients have to be multiplied occasionally by
abscissae of the third or fourth powers of 7 to 9. Hence to get ordinates true, as
far as the sample goes, to the second or third figure, we require to work to a fairly
high number of figures. There is no magic in six figures, four or five would probably
satisfy another worker, but they are easily read off the calculator we use, and if the
constants had been tabled only to four or five, no reader would have been able to
agree exactly, if he wished to test any of our results, even to three figures, with the
final ordinates.

* 'Biometrika,' vol. II., p. 12.
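
The point about the number of figures can be made concrete with quartic (a) of Illustration E. The sketch below assumes the coefficients and means as read from the text, and takes "four figures" to mean four places of decimals, which is an assumption made only for this illustration; the names are likewise illustrative.

```python
# Quartic (a), lowest power first, with X measured from the mean position.
QUARTIC_A = (1.724611, -0.913208, -0.169311, 0.012629, 0.000927)
MEAN_Y = 7.216851
MEAN_X = 6.403315


def ordinate(coefficients, position):
    """Ordinate of the quartic at a whorl position, by Horner's rule."""
    x = position - MEAN_X
    value = 0.0
    for coefficient in reversed(coefficients):
        value = value * x + coefficient
    return MEAN_Y + value


def truncated(coefficients, places):
    """The same constants tabled to a smaller number of decimal places."""
    return tuple(round(c, places) for c in coefficients)


if __name__ == "__main__":
    position = 16  # here the abscissa is about 9.6, so its fourth power is about 8500
    full = ordinate(QUARTIC_A, position)
    short = ordinate(truncated(QUARTIC_A, 4), position)
    print(f"six places of decimals:  {full:.3f}")
    print(f"four places of decimals: {short:.3f}")
    print(f"difference:              {full - short:.3f}")
```

At the extreme position the shortened constants alter the ordinate in the first place of decimals, because the small cubic and quartic coefficients are multiplied by high powers of the abscissa.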


DIAGRAM I. SKEW CORRELATION IN ASPERULA ODORATA.
DIAGRAM V. SKEW CORRELATION BETWEEN BRANCHES AND POSITION OF WHORL IN EQUISETUM

XV. A MATHEMATICAL THEORY OF RANDOM MIGRATION. By
Karl Pearson, F.R.S., with the assistance of John Blakeman, M.Sc.

(1) Introductory. In dealing with any natural phenomenon, especially one
of a vital nature, with all the complexity of living organisms in type and habit,
the mathematician has to simplify the conditions until they reach the attenuated
character which lies within the power of his analysis*. The problem of migration
is  one  which  is  largely  statistical,  but  it  involves  at  the  same  time  a  close  study 
of  geographical  and  geological  conditions,  and  of  food  and  shelter  supply  peculiar 
to  each  species.  Some  years  ago  the  late  Professor  Weldon  started  an  extensive 
study  as  to  the  distribution  of  various  species  and  local  races  of  land  snails,  but 
he was struck by the absence in several cases of any definite change of
environment at the boundaries of the distribution of a definite race. It occurred to me
in  thinking  over  the  matter  that  such  boundaries,  where  they  exist,  may  possibly 
not  be  permanent.  To  take  a  purely  hypothetical  illustration  :  A  species  is  pushed 
back  to  a  certain  limit  by  a  change  of  environmental  conditions — say,  an  ice  age. 
Does it follow that, if the environment again becomes favourable, it will
rapidly occupy possible country? What is the rate of infiltration of a species
into  a  possible  habitat  ?  It  depends,  of  course,  on  a  whole  series  of  most  complex 
conditions,  the  rate  of  locomotion,  the  channels  of  communication,  the  distribution 
of  food  areas  and  breeding  grounds  in  the  new  country,  and  the  connecting  links 
between  all  these.  Every  detail  must  be  studied  by  the  field  naturalist  in  relation 
to  each  species.  All  the  mathematician  can  do  is  to  make  an  idealised  system, 
which  may  be  dangerous,  if  applied  dogmatically  to  any  particular  case,  but  which  can 
hardly  fail  to  be  suggestive,  if  it  be  treated  within  the  limits  of  reasonable  application. 
The  idealised  system  which  I  proposed  to  myself  was  of  the  following  kind  : 

(i)  Breeding  grounds  and  food  supply  are  supposed  to  have  an  average  uniform 
distribution  over  the  district  under  consideration.  There  is  to  be  no  special  following 
of  river  beds  or  forest  tracks. 

* This is of course a perfectly familiar process to every mathematical physicist, but its unfamiliarity
leads the biologist to suspect or even discard mathematical reasoning, instead of testing the result
as the physicist does by experiment and observation.

(ii) The species scattering from a centre is supposed to distribute itself
uniformly  in  all  directions.  The  average  distance  through  which  an  individual  of 
the  species  moves  from  habitat  to  habitat  will  be  spoken  of  as  a  "flight,"  and  there 
may  be  n  such  "flights"  from  locus  of  origin  to  breeding  ground,  or  again  from 
breeding  ground  to  breeding  ground,  if  the  species  reproduces  more  than  once. 
A flight is to be distinguished from a "flitter," a mere to and fro motion associated
with  the  quest  for  food  or  mate  in  the  neighbourhood  of  the  habitat. 

(iii) Now taking a centre, reduced in the idealised system to a point, what
would be the distribution after n random flights of N individuals departing from
this centre? This is the first problem. I will call it the Fundamental Problem of
Random Migration.

(iv)  Supposing  the  first  problem  solved,  we  have  now  to  distribute  such  points 
over an area bounded by any contour, and mark the distribution on both sides
of  the  contour  after  any  number  of  breeding  seasons.  The  shape  of  the  contour  and 
the  number  of  seasons  dealt  with  provide  a  series  of  problems  which  may  be  spoken 
of  as  Secondary  Problems  of  Migration. 
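
The Fundamental Problem stated in (iii) lends itself to direct simulation. The following is a minimal sketch, assuming flights of fixed length with directions uniform over the circle, as in hypotheses (i) and (ii); the function names, the seed and the sample sizes are illustrative choices. It uses the standard result that the mean square distance from the centre after n such flights is $nl^2$, which is not derived in the text.

```python
import math
import random


def simulate_flights(n_flights, flight_length, n_individuals, seed=0):
    """Distances from the centre of dispersion after n random flights."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_individuals):
        x = y = 0.0
        for _ in range(n_flights):
            angle = rng.uniform(0.0, 2.0 * math.pi)  # direction of this flight
            x += flight_length * math.cos(angle)
            y += flight_length * math.sin(angle)
        distances.append(math.hypot(x, y))
    return distances


def mean_square_distance(distances):
    return sum(d * d for d in distances) / len(distances)


if __name__ == "__main__":
    for n in (2, 3, 7, 10):
        d = simulate_flights(n, 1.0, 5000, seed=n)
        print(f"n = {n:2d}: mean r^2 = {mean_square_distance(d):6.3f} "
              f"(n l^2 = {n}), greatest distance = {max(d):.3f}")
```

No individual can be farther than nl from the centre, so for few flights the distribution has a sharp outer boundary.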

A  little  consideration  of  the  Fundamental  Problem  showed  me  that  it  presented 
considerable  analytical  difficulties,  and  I  was  by  no  means  clear  that  the  series  of 
hypotheses  adopted  would  be  sufficiently  close  to  the  natural  conditions  of  any 
species  to  repay  the  labour  involved  in  the  investigation.  At  this  stage  the  matter 
rested,  until  last  year  Major  Ross  put  before  me  the  same  problem  as  being  of 
essential  importance  for  the  infiltration  of  mosquitoes  into  cleared  areas,  and  asked 
me  if  I  could  not  provide  the  statistical  solution  of  it.  He  considered  that  we 
might  treat  a  district  as  approximately  "  equi-swampous,"  and  thus  my  conditions 
(i),  (ii)  above  could  be  applied  to  obtain  at  any  rate  a  first  approximation  to 
the  solution. 

Starting  on  the  problem  again  I  obtained  the  solution  for  the  distribution  after 
two  flights,  an  integral  expressing  the  distribution  after  three  flights,  which  I 
carelessly  failed  to  see  could  be  at  once  reduced  to  an  elliptic  integral,  and  the 
general  functional  relation  between  the  distribution  after  successive  flights.  At  this 
point  I  failed  to  make  further  progress,  and  under  the  heading  of  "The  Problem 
of  the  Random  Walk  "  asked  for  the  aid  of  fellow-mathematicians  in  Nature*.  The 
reply  to  my  appeal  was  threefold.  Mr  Geoffrey  T.  Bennett  sent  me  in  terms  of 
elliptic  integrals  the  solution  for  three  flights.  Lord  Rayleigh  drew  my  attention 
to the fact that the "problem of the random walk" where the number of flights
is very great becomes identical with a problem in the combination of sound
amplitudes in the case of notes of the same period, which he has dealt with in several
papers†. Lastly Professor J. C. Kluyver presented a paper to the Royal Academy
of Sciences of Amsterdam, entitled "A local probability problem."* Professor
Kluyver obtains the general solution in terms of the integral of a product of
Bessel's functions of the zero and first orders. He deduces Rayleigh's solution for
n large, he shows that the Bessel function integral represents a series of different
analytic functions in different intervals, and proves a number of special problems
of very considerable interest. Referring to his general solution, he writes, however:

* July 27th, 1905.

† Phil. Mag., August, 1880, p. 75; February, 1899, p. 246.

"  From  this  result  we  infer  that  the  probability  sought  for  is  of  a  rather  intricate 
character.  The  n  +  1  functions  J  are  oscillating  functions,  and  have  their  signs 
altering  in  an  irregular  manner  as  the  variable  u  increases.  Hence  even  an 
approximation  of  the  integral  is  not  easily  found,  and  as  a  solution  of  Pearson's 
problem it is little apt to meet the requirements of the proposer."†

Kluyver's  solution  is  of  extreme  suggestiveness  for  the  analytical  theory  of 
discontinuous  functions.  In  the  endeavour  to  express  it  in  a  form  suited  to  my 
special  purposes  I  have  come  across  a  long  series  of  Bessel  function  properties, 
some  at  least  of  which  seem  to  me  novel,  but  have  unfortunately  no  bearing 
on  the  problem  of  migration.  If  we  turn  to  Rayleigh's  solution  for  n  large,  I 
must  confess  at  once  to  being  unconvinced  of  the  adequacy  of  the  proofs  used  to 
deduce it, especially that in the Theory of Sound‡. Kluyver's proof of Rayleigh's
solution  §  appears  to  me  to  also  require  much  strengthening,  and  in  neither  case  do 
we  have  any  practical  measure  of  what  the  number  of  flights  must  be  before  we 
have  in  practice  a  reasonable  accordance  between  the  discontinuous  Bessel's  function 
integral  expression  and  the  Rayleigh  solution  of  Gaussian  frequency  type. 

After a good many failures I have succeeded in obtaining a solution in series
of the Bessel function integral, but this is not of a character to be of service for
frequent arithmetical calculations. It serves, however, to test the approximation
of the Rayleigh solution and the accuracy of the solutions for few flights obtained
by other processes. At this stage I realised that the functional equation between
the distributions for n and n + 1 flights could be solved graphically, and that starting
with the known distributions for n = 2 or n = 3, we could by very great labour,
but absolutely straightforward graphical work and the use of mechanical integrators,
build up in succession the solutions for n = 4, 5, 6, 7, etc. I proposed that this
process should be continued until the graphically found distribution coincided with
the plotted values obtained from the above solution in series. This was achieved
for n = 7. For n = 6 and n = 7, the solution in series approaches to the Rayleigh
solution, with which for all practical purposes it may be asserted to coincide for
n = 10. We have thus reached a continuous graphical illustration of the transition
of a series of discontinuous and, in many respects, remarkable analytical functions,
step by step with the increase of n into a normal curve of errors. The relationship
is a noteworthy one, and not without suggestion for other branches of
investigation.

* Koninklijke Akademie van Wetenschappen te Amsterdam. Proceedings, Oct. 25, 1905, pp. 341-50.
† loc. cit. p. 343. ‡ 2nd Edition, § 42 a. § Kluyver, loc. cit. p. 345.

The  exact  method  of  graphical  solution  will  be  described  later  ;  the  whole  labour  of 
it, involving many weeks' work, was due to my assistant, Mr John Blakeman, M.Sc.

(2) General Analytical Solution of the Fundamental Problem. Let the origin
be taken at the centre of dispersion and r be the distance of any small elementary
area a from the centre of dispersion. Let $\phi_n(r^2)\,a$ be the frequency of individuals on
a after the nth flight, and $\phi_{n+1}(r^2)\,a$ their frequency on the same element after
the (n+1)th flight. Let l be the length of the flight. Then only those individuals
who were on a circle of radius l round the centre of a after the nth flight can reach a
with the (n+1)th flight, and only those individuals of these who take their flight
in one definite direction. Let O be the centre of dispersion, C the centre of a,
P a point on the circle of radius l round C, $\angle PCO = \theta$, then the frequency per
unit area at P is $\phi_n(r^2 + l^2 - 2rl\cos\theta)$, and the amount which goes in directions
between $\theta$ and $\theta + \delta\theta$ is $\delta\theta/2\pi$. Hence the frequency per unit area at C after the
(n+1)th flight is given by:

$$\phi_{n+1}(r^2) = \frac{1}{2\pi}\int_0^{2\pi}\phi_n(r^2 + l^2 - 2rl\cos\theta)\,d\theta \qquad \text{(i)}.$$

This is the equation, which I shall speak of as the general functional relation between
the densities at successive flights. Now assume: $\phi_n(r^2) = C_n J_0(ur)$, where $C_n$ is any
undetermined function of n, l and u, and u is at present an undetermined variable.
Then by Neumann's Theorem*:

$$J_0\big(u\sqrt{r^2 + l^2 - 2rl\cos\theta}\big) = J_0(ur)J_0(ul) + 2\sum_{t=1}^{\infty}J_t(ur)J_t(ul)\cos t\theta.$$

Hence: $$\frac{1}{2\pi}\int_0^{2\pi}C_n J_0\big(u\sqrt{r^2 + l^2 - 2rl\cos\theta}\big)\,d\theta = C_n J_0(ur)J_0(ul)$$

$$= C_{n+1}J_0(ur), \text{ by (i)}.$$

Therefore $C_{n+1} = J_0(ul)\,C_n$.

It follows that $C_n = D\{J_0(ul)\}^n$, where D may be any function of l, but not of n.

Thus we have: $\phi_n(r^2) = D\,J_0(ur)\{J_0(ul)\}^n$,

where we may sum for any or all values of u.
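
The step resting on Neumann's Theorem, that the angular mean of $J_0(u\sqrt{r^2 + l^2 - 2rl\cos\theta})$ is $J_0(ur)J_0(ul)$, can be checked numerically. The sketch below is illustrative: it evaluates $J_0$ from its power series, which is adequate only for moderate arguments, and the particular values of u, r and l are arbitrary choices.

```python
import math


def bessel_j0(x, terms=60):
    """J_0(x) from its power series; adequate for moderate |x| only."""
    total = 0.0
    term = 1.0
    quarter_x2 = 0.25 * x * x
    for k in range(terms):
        total += term
        term *= -quarter_x2 / ((k + 1) ** 2)
    return total


def angular_mean(u, r, l, steps=720):
    """(1/2 pi) times the integral over theta of J_0(u sqrt(r^2 + l^2 - 2 r l cos theta))."""
    acc = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * (i + 0.5) / steps  # midpoint rule on a full period
        d2 = r * r + l * l - 2.0 * r * l * math.cos(theta)
        acc += bessel_j0(u * math.sqrt(max(d2, 0.0)))
    return acc / steps


if __name__ == "__main__":
    u, r, l = 1.3, 2.0, 1.0
    print("angular mean     :", angular_mean(u, r, l))
    print("J0(ur) * J0(ul)  :", bessel_j0(u * r) * bessel_j0(u * l))
```

The cosine terms of the theorem integrate to zero over a full period, which is why only the product $J_0(ur)J_0(ul)$ survives.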

Now when n = 1, $\phi_1(r^2)$ must be zero, for all values of r except r = l to l + $\tau$, and it
then equals $N/(2\pi l\tau)$, $\tau$ being very small and N the total number issuing from
the centre of dispersion. We know, however, that†:

$$\int_0^\infty du\int_q^p u\rho f(\rho)J_n(u\rho)J_n(ur)\,d\rho = f(r), \text{ if } q < r < p;$$

$$= 0, \text{ if } r > p \text{ or } < q.$$

* C. Neumann, Theorie der Besselschen Functionen, S. 65.
† Gray and Mathews, Treatise on Bessel's Functions, p. 80.

Now take n = 0, q = l, p = l + $\tau$ and $f(\rho) = \dfrac{N}{2\pi l\tau}$;

then we have: $$\frac{N}{2\pi l\tau}\int_0^\infty u\,l\,J_0(ul)J_0(ur)\,\tau\,du = \phi_1(r^2),$$

or, $$\phi_1(r^2) = \frac{N}{2\pi}\int_0^\infty uJ_0(ul)J_0(ur)\,du \qquad \text{(ii)}.$$

This determines the form of D and the summation of u; for, if we take

$$\phi_n(r^2) = \frac{N}{2\pi}\int_0^\infty uJ_0(ur)\{J_0(ul)\}^n\,du \qquad \text{(iii)},$$

we satisfy the general functional relation (i) and further the initial equation (ii).

Let $P_n(r)$ be the probability that an individual after n flights will be a distance r
or less from the centre of dispersion. Then clearly

$$P_n(r) = 2\pi\int_0^r r\,dr\,\phi_n(r^2)$$

$$= N\int_0^r r\,dr\int_0^\infty uJ_0(ur)\{J_0(ul)\}^n\,du$$

$$= N\int_0^\infty rJ_1(ur)\{J_0(ul)\}^n\,du,$$

or if v = ur: $$= N\int_0^\infty J_1(v)\Big\{J_0\Big(\frac{vl}{r}\Big)\Big\}^n dv \qquad \text{(iv)}.$$

(iv) is Kluyver's fundamental solution, which he reaches by a very different
and more general analysis; (iii) is the form of it which best suits my present
investigation.

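(3) On an expansion in series of the expression for $\phi_n(r^2)$. By
straightforward but somewhat laborious multiplication it can be shown that*:

$$\{J_0(2\sqrt{y})\,e^{y}\}^n = 1 - \frac{n}{4}y^2 - \frac{n}{9}y^3 + \frac{(6n-11)n}{192}y^4 + \frac{(50n-57)n}{1800}y^5 - \frac{(1892 - 2125n + 270n^2)n}{103{,}680}y^6 + \ldots$$

* Gray and Mathews, loc. cit. p. 13.

Hence putting $2\sqrt{y} = z$,

$$\{J_0(z)\}^n = e^{-\frac{1}{4}nz^2}\Big\{1 - \frac{n}{4}\frac{z^4}{16} - \frac{n}{9}\frac{z^6}{64} + \frac{(6n-11)n}{192}\frac{z^8}{256} + \frac{(50n-57)n}{1800}\frac{z^{10}}{1024} - \frac{(1892 - 2125n + 270n^2)n}{103{,}680}\frac{z^{12}}{4096} + \ldots\Big\}$$

$$= e^{-\frac{1}{4}nz^2}\{1 - a_4z^4 - a_6z^6 - a_8z^8 - a_{10}z^{10} - a_{12}z^{12} - \text{etc.}\},$$

let us write, for brevity. The a's are then known coefficients.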
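
The multiplication leading to the series for $\{J_0(2\sqrt{y})\,e^{y}\}^n$ can be repeated exactly with rational arithmetic. The sketch below is a check only, assuming the power series $J_0(2\sqrt{y}) = \sum (-1)^k y^k/(k!)^2$ and the coefficients as reconstructed above; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

ORDER = 6  # keep powers of y up to y^6


def poly_mul(p, q):
    """Product of two truncated power series in y."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out


def series_power(n):
    """Coefficients of {J_0(2 sqrt(y)) e^y}^n up to y^6, by repeated multiplication."""
    bessel = [Fraction((-1) ** k, factorial(k) ** 2) for k in range(ORDER + 1)]
    exponential = [Fraction(1, factorial(k)) for k in range(ORDER + 1)]
    base = poly_mul(bessel, exponential)
    result = [Fraction(1)] + [Fraction(0)] * ORDER
    for _ in range(n):
        result = poly_mul(result, base)
    return result


def stated_coefficients(n):
    """The closed forms quoted in the text for the same coefficients."""
    n = Fraction(n)
    return [
        Fraction(1),
        Fraction(0),
        -n / 4,
        -n / 9,
        (6 * n - 11) * n / 192,
        (50 * n - 57) * n / 1800,
        -(1892 - 2125 * n + 270 * n * n) * n / 103680,
    ]


if __name__ == "__main__":
    for n in range(1, 9):
        agree = series_power(n) == stated_coefficients(n)
        print(f"n = {n}: closed forms agree with direct multiplication: {agree}")
```

Since the arithmetic is exact, agreement for several values of n is a strong check on the signs and denominators of the quoted coefficients.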
Now  by  (iii) 

$$\phi_n(r^2) = \frac{N}{2\pi}\int_0^\infty uJ_0(ur)\{J_0(ul)\}^n\,du.$$

But  we  know  that*  : 

ue  *  {u)  )  clu  =  '   

0 

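Write: $\tfrac{1}{2}nl^2 = \sigma^2$ (vii).

Thus: $$\int_0^\infty u\,e^{-\frac{1}{2}\sigma^2u^2}J_0(ur)\,du = \frac{1}{\sigma^2}e^{-\frac{1}{2}r^2/\sigma^2} \qquad \text{(viii)}.$$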

Differentiate (viii) s times with regard to $\sigma^2$:

Hence, if $\beta = -2\sigma^2/r^2$, we have:

J  0 


2'*+'  d'  /I 
1 


2^9+1 

We  have  therefore 


-  —       ^2s'  say. 


^    ^    '      2770-'  277  ,  =  2  ^ 


S  +  2 


where  it  remains  to  evaluate  i,,  =  e^'^ 


We  find 


8  \crj  \         a-'     4:  a' 


16  Vo-/  \         cr'     4  cr'     8  cr' 

i  /_  \  //2<r  24_48-,+  l8--2-  +  — - 
32  \cr/  \  o-"     16  a 

*  Gray  and  Mathews,  loc.  cit.  Eqn.  (162),  p.  78 
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1  /7^V-     -2/2^-2  /          ^^^r'  r'  25         1  r 


-  -Ws  (-o-.eo      ,3.0  ^  300  ^. .  f  ^  - 1  (r)" .  i 

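Remembering that by (vii) $l^2 = 2\sigma^2/n$, we have from (ix)

$$\phi_n(r^2) = \frac{N}{2\pi\sigma^2}e^{-\frac{1}{2}r^2/\sigma^2}\Big\{1 - \frac{1}{4n}\Big(2 - 2\frac{r^2}{\sigma^2} + \frac{1}{4}\frac{r^4}{\sigma^4}\Big) - \frac{1}{9n^2}\Big(6 - 9\frac{r^2}{\sigma^2} + \frac{9}{4}\frac{r^4}{\sigma^4} - \frac{1}{8}\frac{r^6}{\sigma^6}\Big)$$

$$+ \frac{(6n-11)n}{192n^4}\Big(24 - 48\frac{r^2}{\sigma^2} + 18\frac{r^4}{\sigma^4} - 2\frac{r^6}{\sigma^6} + \frac{1}{16}\frac{r^8}{\sigma^8}\Big)$$

$$+ \frac{(50n-57)n}{1800n^5}\Big(120 - 300\frac{r^2}{\sigma^2} + 150\frac{r^4}{\sigma^4} - 25\frac{r^6}{\sigma^6} + \frac{25}{16}\frac{r^8}{\sigma^8} - \frac{1}{32}\frac{r^{10}}{\sigma^{10}}\Big)$$

$$- \frac{(1892 - 2125n + 270n^2)n}{103{,}680\,n^6}\Big(720 - 2160\frac{r^2}{\sigma^2} + 1350\frac{r^4}{\sigma^4} - 300\frac{r^6}{\sigma^6} + \frac{225}{8}\frac{r^8}{\sigma^8} - \frac{9}{8}\frac{r^{10}}{\sigma^{10}} + \frac{1}{64}\frac{r^{12}}{\sigma^{12}}\Big)$$

$$- \text{etc.}\Big\} \qquad \text{(x)}.$$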

This is the general expansion for the distribution of the individuals emerging
from a centre of dispersion after n random flights. Clearly if we want to go as
far as $1/n^s$ we must retain terms up to $(r^2/\sigma^2)^{2s}$, and the convergence is slow for n
small. Thus for the first two or three flights, (x) as far as I have calculated the
terms gives poor results, even if they are notwithstanding better than the Rayleigh
solution. The arithmetical work required to calculate the ordinates is also severe.
If  we  put  w  =  00  ,  we  have 

^-^'"^  =  ^^'   (^^)' 

Lord  Rayleigh's  expression.  Now  (r  =  \^'nl',  hence  unless  I  becomes  indefinitely  small 
as  n  becomes  indefinitely  large  the  population  becomes  widely  scattered.  If  the 
single  flight  I  be  very  small,  but  the  total  flight  nl  be  finite,  then  ^nV'  tends  to 
become  vanishingly  small,  or  the  population  remains  close  to  the  centre  of  dispersion. 
This  is  really  the  "flitter"  as  distinct  from  the  flight. 
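The arithmetic of (x) is easily mechanised. The following is a minimal sketch, assuming the five polynomial factors and their multipliers exactly as they stand in (x), with N = 1 and l = 1 by default; the names `POLYS`, `coefficients`, `rayleigh` and `phi_series` are illustrative only and are not part of the memoir.

```python
import math

# Polynomial factors of (x), as coefficients of successive powers of y = r^2 / sigma^2.
POLYS = [
    [2, -2, 1 / 4],
    [6, -9, 9 / 4, -1 / 8],
    [24, -48, 18, -2, 1 / 16],
    [120, -300, 150, -25, 25 / 16, -1 / 32],
    [720, -2160, 1350, -300, 225 / 8, -9 / 8, 1 / 64],
]


def coefficients(n):
    """Multipliers of the five polynomial factors in (x) for n flights."""
    return [
        -1 / (4 * n),
        -1 / (9 * n ** 2),
        (6 * n - 11) / (192 * n ** 3),
        (50 * n - 57) / (1800 * n ** 4),
        -(1892 - 2125 * n + 270 * n ** 2) / (103680 * n ** 5),
    ]


def rayleigh(r, n, l=1.0, N=1.0):
    """Rayleigh term alone: N / (2 pi sigma^2) * exp(-r^2 / (2 sigma^2)), sigma^2 = n l^2 / 2."""
    sigma2 = 0.5 * n * l * l
    return N / (2 * math.pi * sigma2) * math.exp(-r * r / (2 * sigma2))


def phi_series(r, n, l=1.0, N=1.0):
    """Expansion (x) truncated after the five tabulated polynomial factors."""
    y = r * r / (0.5 * n * l * l)
    bracket = 1.0
    for mult, poly in zip(coefficients(n), POLYS):
        bracket += mult * sum(c * y ** k for k, c in enumerate(poly))
    return rayleigh(r, n, l, N) * bracket
```

For eight flights this sketch gives a central ordinate close to 0.0375, against about 0.0398 from the Rayleigh term alone.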

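Examining the solution found it is clear that it may be looked upon as the sum of products of two factors, one series of factors not involving $r/\sigma$ but only n and the other not involving n but only $r/\sigma$. Thus we may write

$$\phi_n(r^2) = N\,(\omega_0 + \nu_4\,\omega_4 + \nu_6\,\omega_6 + \nu_8\,\omega_8 + \ldots),$$

where

$$\omega_0 = \frac{1}{2\pi\sigma^2}\, e^{-\frac{1}{2} r^2/\sigma^2}, \qquad \omega_4 = \frac{1}{2\pi\sigma^2}\, e^{-\frac{1}{2} r^2/\sigma^2}\Big(2 - 2\frac{r^2}{\sigma^2} + \frac{1}{4}\frac{r^4}{\sigma^4}\Big), \qquad \nu_4 = -\frac{1}{4n}, \qquad \nu_6 = -\frac{1}{9n^2},$$

etc. (xii).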

The ω-functions form a series of such special interest that a few of their remarkable properties will be stated in the next section.

(4) Properties of the ω-functions.

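Let us consider the pth moment round the origin of the 2sth ω-function taken over all plane space. We will denote it by $m_{p,2s}$. Then

$$m_{p,2s} = \int_0^{2\pi}\!\!\int_0^{\infty} \omega_{2s}\, r^{p}\, r\,dr\,d\theta = 2\pi\int_0^{\infty} \omega_{2s}\, r^{p+1}\,dr \qquad \text{(xiii).}$$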

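and if $\beta = -2\sigma^2/r^2$ we have $d\beta = \dfrac{4\sigma^2}{r^3}\,dr$.

Hence writing $p = 2q$ we find

$$m_{2q,2s} = (-1)^{s-q}\,(2\sigma^2)^q \int_{-\infty}^{-0} \beta^{\,s-q-1}\, \frac{d^s}{d\beta^s}\Big(\frac{1}{\beta}\, e^{1/\beta}\Big)\, d\beta \qquad \text{(xv).}$$

Integrate by parts and we have

$$m_{2q,2s} = (-1)^{s-q}\,(2\sigma^2)^q \bigg[ \Big\{ \beta^{\,s-q-1}\, \frac{d^{s-1}}{d\beta^{s-1}}\Big(\frac{1}{\beta}\, e^{1/\beta}\Big) \Big\}_{-\infty}^{-0} - (s-q-1)\int_{-\infty}^{-0} \beta^{\,s-q-2}\, \frac{d^{s-1}}{d\beta^{s-1}}\Big(\frac{1}{\beta}\, e^{1/\beta}\Big)\, d\beta \bigg].$$

The part in curled brackets vanishes at the limits and thus

$$m_{2q,2s} = (-1)^{s-q-1}\,(2\sigma^2)^q\,(s-q-1) \int_{-\infty}^{-0} \beta^{\,s-q-2}\, \frac{d^{s-1}}{d\beta^{s-1}}\Big(\frac{1}{\beta}\, e^{1/\beta}\Big)\, d\beta = m_{2q,2s-2}\,(s-q-1).$$

Repeating this process we find

$$m_{2q,2s} = (s-1-q)(s-2-q)(s-3-q)\ldots(-q) \times (-1)^{q} \times (2\sigma^2)^q \times \int_{-\infty}^{-0} \beta^{-q-2}\, e^{1/\beta}\, d\beta \qquad \text{(xvi).}$$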


The integral is finite and known; hence if q be less than s we find for integer values

$$m_{2q,2s} = 0, \qquad q < s \qquad \text{(xvii).}$$

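Now consider $\omega_{2s}$ as made up of two parts,

$$\omega_{2s} = \frac{1}{2\pi\sigma^2}\, e^{-\frac{1}{2} r^2/\sigma^2} \times \chi_{2s} \qquad \text{(xviii).}$$

Then it is clear that $\chi_{2q}$, if q be less than s, consists of powers of $r^2/\sigma^2$ less than s, and therefore

$$2\pi\int_0^{\infty} \omega_{2s}\, \chi_{2q}\, r\,dr = 0.$$

Accordingly a remarkable property holds for the χ-function part of the ω-function, namely, if $\chi_{2s}$ and $\chi_{2s'}$ be any two such functions, then it follows that

$$\int_0^{\infty} e^{-\frac{1}{2} r^2/\sigma^2}\, \chi_{2s}\, \chi_{2s'}\, r\,dr = 0, \quad \text{if } s \text{ and } s' \text{ be different} \qquad \text{(xix).}$$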

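Returning to (xvi), let us put $q = s$, then

$$m_{2s,2s} = (-1)(-2)\ldots(-s) \times (-1)^{s}\,(2\sigma^2)^s \int_{-\infty}^{-0} \beta^{-s-2}\, e^{1/\beta}\, d\beta = 2^s \sigma^{2s}\, (-1)^s\, s! \int_0^{\infty} x^{s} e^{-x}\, dx,$$

or,

$$m_{2s,2s} = (-1)^s\, \sigma^{2s}\, 2^s\, (s!)^2 \qquad \text{(xx).}$$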

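Let us now consider the integral over the plane

$$I = 2\pi\int_0^{\infty} \omega_{2s}\, \chi_{2s}\, r\,dr.$$

Except for the last term in $\chi_{2s}$, it will consist of a number of terms having for factors $m_{2q,2s}$ with $q < s$ and these all vanish. The last term in $\chi_{2s}$ is

$$(-1)^s\, \frac{1}{2^s}\Big(\frac{r}{\sigma}\Big)^{2s},$$

and accordingly

$$I = 2\pi\int_0^{\infty} \omega_{2s}\, \chi_{2s}\, r\,dr = \frac{2\pi\,(-1)^s}{2^s \sigma^{2s}} \int_0^{\infty} \omega_{2s}\, r^{2s+1}\, dr,$$

or by (xx)

$$I = (s!)^2 \qquad \text{(xxi).}$$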

Hence we have the following properties:

(a) The integral all over the plane of distribution of one product of a χ-function into an ω-function of a different order is zero.

(b) The integral all over the plane of distribution of the product of a χ-function into an ω-function of the same order is, if 2s be the order, equal to $(s!)^2$.

These properties enable us, as in the case of Bessel's functions or Legendre's functions, to expand any function symmetrical round a centre and a function only of the square of the distance from that centre in ω-functions.
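Properties (a) and (b) can be checked numerically. The sketch below assumes that the χ-functions are the polynomial factors appearing in (x), generated here from the finite difference relation (xxix) written in the variable $t = r^2/(2\sigma^2)$, so that the plane integral reduces to $\int_0^\infty e^{-t}\chi_{2s}\chi_{2s'}\,dt$. The function names, the Simpson rule and the cut-off at t = 80 are choices made for illustration only.

```python
import math


def chi(s, t):
    """chi_{2s} as a polynomial in t = r^2 / (2 sigma^2), built up by recurrence."""
    prev, cur = 1.0, 1.0 - t          # chi_0 and chi_2
    if s == 0:
        return prev
    for k in range(1, s):
        # chi_{2(k+1)} = (2k + 1 - t) chi_{2k} - k^2 chi_{2(k-1)}
        prev, cur = cur, (2 * k + 1 - t) * cur - k * k * prev
    return cur


def plane_integral(s, s2, t_max=80.0, steps=16000):
    """Simpson estimate of 2 pi * integral of omega_{2s} chi_{2s2} r dr over the plane."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(-t) * chi(s, t) * chi(s2, t)
    return total * h / 3.0
```

With these assumptions `plane_integral(2, 3)` comes out sensibly zero, while `plane_integral(3, 3)` is close to $(3!)^2 = 36$.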


Thus let

$$F(r^2) = \sum_{s=0}^{s=\infty} (b_{2s}\, \omega_{2s}),$$

multiply by $\chi_{2s}$ and integrate all over the plane,

$$2\pi\int_0^{\infty} F(r^2)\, \chi_{2s}\, r\,dr = b_{2s}\, 2\pi\int_0^{\infty} \omega_{2s}\, \chi_{2s}\, r\,dr = b_{2s}\,(s!)^2.$$

Hence

$$b_{2s} = \frac{2\pi}{(s!)^2}\int_0^{\infty} F(r^2)\, \chi_{2s}\, r\,dr \qquad \text{(xxii).}$$


Now $\chi_{2s}$ consists of an algebraic series in $(r/\sigma)^2$. Thus the discovery of the value of the integral $\int_0^{\infty} F(r^2)\, \chi_{2s}\, r\,dr$ depends solely on the determination of the odd moments of $F(r^2)$ between 0 and ∞. We conclude therefore that an expansion in ω-functions involves merely the determination of moments, such as every statistician has been accustomed for years to calculate. This is not the proper occasion to deal fully with the properties of the ω-functions, nor to generalise them for odd powers of r, and to consider the convergency of expansions in terms of ω-functions. They will be discussed on another occasion, but the present writer believes that they will be found of not inconsiderable service, not only in statistical problems, but in certain physical problems where intensity round an axis diminishes with the distance.

(5) Two further problems are of service for the theory of dispersion. Suppose

$$F(r^2) = \sum_{s=0}^{s=\infty} (b_{2s}\, \omega_{2s}).$$

Integrate over the plane and remember that $\chi_0 = 1$:

$$2\pi\int_0^{\infty} F(r^2)\, r\,dr = \sum_{s=0}^{s=\infty} 2\pi\int_0^{\infty} b_{2s}\, \omega_{2s}\, \chi_0\, r\,dr = b_0 \qquad \text{(xxiii).}$$

Thus the first coefficient is merely the total volume of the surface $z = F(r^2)$, taken over the plane.

Next consider the second moment

$$2\pi\int_0^{\infty} r^2 F(r^2)\, r\,dr = \sum_{s=0}^{s=\infty} 2\pi\int_0^{\infty} b_{2s}\, \omega_{2s}\, r^2\, r\,dr.$$

Every term of the summation vanishes except for s = 0 and s = 1, and the left-hand side is the second moment of the function about the axis perpendicular to the plane through the centre = volume × (swing-radius)$^2$ = $b_0 \times K^2$, say. Thus:

$$b_0 K^2 = 2 b_0 \sigma^2 + b_2 (2 - 4)\sigma^2 = 2(b_0 - b_2)\sigma^2,$$

or

$$b_2 = b_0 \{1 - \tfrac{1}{2} K^2/\sigma^2\} \qquad \text{(xxiv).}$$

Thus far no choice has been made of $\sigma^2$. If we take $\sigma^2 = \tfrac{1}{2}K^2$, we have $b_2 = 0$, or if $\sigma^2$ be taken half the square of the swing-radius about the axis of the solid of revolution $z = F(r^2)$, that is if σ be the swing-radius of the solid about any plane through its axis, then the second term in the expansion of $F(r^2)$ in ω-functions disappears.
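The determination of the first two coefficients from moments may be illustrated on a simple surface. The sketch below assumes $\chi_0 = 1$ and $\chi_2 = 1 - \tfrac{1}{2}r^2/\sigma^2$, and takes as a purely hypothetical example the cone $z = 1 - r$ for r from 0 to 1; the helper names and the Simpson quadrature are illustrative choices.

```python
import math


def simpson(f, a, b, steps=4000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def first_coefficients(F, sigma, r_max):
    """b_0 and b_2 from (xxii), with chi_0 = 1 and chi_2 = 1 - r^2 / (2 sigma^2)."""
    b0 = 2 * math.pi * simpson(lambda r: F(r * r) * r, 0.0, r_max)
    b2 = 2 * math.pi * simpson(
        lambda r: F(r * r) * (1 - r * r / (2 * sigma ** 2)) * r, 0.0, r_max)
    return b0, b2


def swing_radius_sq(F, r_max):
    """K^2: second moment about the axis divided by the volume."""
    volume = 2 * math.pi * simpson(lambda r: F(r * r) * r, 0.0, r_max)
    second = 2 * math.pi * simpson(lambda r: F(r * r) * r ** 3, 0.0, r_max)
    return second / volume


def cone(rho2):
    """Hypothetical test surface z = 1 - r, given as a function of rho2 = r^2."""
    return max(0.0, 1.0 - math.sqrt(rho2))
```

For the cone $K^2 = 0.3$, so that choosing $\sigma^2 = 0.15$ makes $b_2$ vanish, while with σ = 1 the value of $b_2$ agrees with (xxiv).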

We are now able, I think, to grasp the relation of the Rayleigh solution to the complete solution of the random scatter round a centre of dispersion. If $\phi_n(r^2)$ be the function giving the distribution after n flights, then $\phi_n(r^2)$ can be expanded in a series of ω-functions, i.e.

$$\phi_n(r^2) = b_0\,\omega_0 + b_2\,\omega_2 + b_4\,\omega_4 + \ldots + b_{2s}\,\omega_{2s} + \ldots$$

By choosing the $\sigma^2$ of the ω-functions $= \tfrac{1}{2}K^2$, this becomes, since $b_0$, the volume, $= N$,

$$\phi_n(r^2) = \frac{N}{2\pi\sigma^2}\, e^{-\frac{1}{2} r^2/\sigma^2} \{1 + b_4\chi_4 + \ldots + b_{2s}\chi_{2s} + \ldots\}.$$

Lord Rayleigh's solution provides the first term of this series, or is the correct solution to two terms in the expansion by ω-functions. It possesses the properties (a) that its volume is the same as that of the complete solution, and (b) the mean square deviation from the centre of dispersion is the same, i.e. $2\sigma^2$, as for the complete solution.

The latter depends upon the fundamental property of the ω-functions that

$$\int_0^{\infty} \omega_{2s}\, r^3\, dr = 0, \quad \text{if } s \text{ be} > 1.$$

The expansion in ω-functions shows us at once that, whatever be the magnitude of n, the root mean square deviation from the centre of dispersion is $\sqrt{n}\; l$ (the mean square deviation being $2\sigma^2 = nl^2$), and this gives us readily a rough measure of the range of habitat of any species for which n and l are approximately known.

Another point may be noted here as to the Rayleigh solution. That solution is the best fitting Gaussian error surface to the distribution, i.e. its volume and its standard deviation are the same as those of the actual distribution, whatever n may be. If we take the section, however, of the distribution through its axis the standard deviation of this according to the Rayleigh solution is $\sigma = \sqrt{\tfrac{1}{2}n}\; l$, but this is not the standard deviation of the section of the actual distribution, i.e. the Rayleigh solution does not give the best fitting normal curve to the section. It gives only the standard deviation corresponding to the $\omega_0$ term. It is of some value to note what are the standard deviations of other component $\omega_{2s}$ terms.

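To obtain this we must determine the area and even moments corresponding to any $\omega_{2s}$ term. Let

$$A_{2s} = \int_0^{\infty} \omega_{2s}\, dr,$$

whence integrating by parts:

$$A_{2s} = \frac{1}{\sqrt{2\pi}\,\sigma}\cdot\frac{1}{2}\,\Big(s - \frac{1}{2}\Big)\Big(s - \frac{3}{2}\Big)\Big(s - \frac{5}{2}\Big)\ldots\frac{1}{2} \qquad \text{(xxv),}$$

s = or > 1.

If s = 0,

$$A_0 = \frac{1}{\sqrt{2\pi}\,\sigma}\cdot\frac{1}{2} \qquad \text{(xxvi).}$$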


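I now take:

$$\mu_{2q,2s} = \int_0^{\infty} \omega_{2s}\, r^{2q}\, dr,$$

and find, reducing in the same manner,

$$\mu_{2q,2s} = \frac{1}{\sqrt{2\pi}}\cdot\frac{1}{2}\,\sigma^{2q-1}\,\Big(s - q - \frac{1}{2}\Big)\Big(s - q - \frac{3}{2}\Big)\ldots\Big(-q + \frac{1}{2}\Big) \times 1\,.\,3\,.\,5 \ldots (2q-1) \qquad \text{(xxvii).}$$

Clearly:

$$m_{2q-1,2s} = 2\pi\int_0^{\infty} \omega_{2s}\, r^{2q}\, dr$$

by (xiii), hence: $m_{2q-1,2s} = 2\pi\,\mu_{2q,2s}$.

Or,

$$m_{2q-1,2s} = \tfrac{1}{2}\sqrt{2\pi}\;\sigma^{2q-1}\,\Big(s - q - \frac{1}{2}\Big)\Big(s - q - \frac{3}{2}\Big)\ldots\Big(-q + \frac{1}{2}\Big) \times 1\,.\,3\,.\,5 \ldots (2q-1) \qquad \text{(xxviii).}$$

Thus the odd moments of the $\omega_{2s}$ functions are known*. For the particular case when 2q = 2:

$$\mu_{2,2s} = A_{2s}\, k_{2s}^2,$$

if $k_{2s}$ be the swing-radius round the axis of the function $\omega_{2s}$. Hence by (xxv)

$$k_{2s}^2 = -\frac{\tfrac{1}{2}\sigma^2}{s - \tfrac{1}{2}} = -\frac{\sigma^2}{2s - 1} \qquad \text{(xxxiii).}$$

This is also true for s = 0, as well as any integer value. It follows accordingly that while the total area of any ω-function from 0 to ∞ is positive, its $k^2$ is negative for values of s > 1. In other words the negative parts of ω are on the whole furthest from the axis. Again the absolute value of $k_{2s}^2$ decreases as $\dfrac{1}{2s-1}$ when s increases, or the higher the ω-function the less it contributes relative to its area to the total mean square deviation of the curve.

* If $x = r/\sigma$ the following finite difference and differential equations are fundamental in the theory of the ω-functions:

$$\omega_{2(s+2)} - (2s + 3 - \tfrac{1}{2}x^2)\,\omega_{2(s+1)} + (s+1)^2\,\omega_{2s} = 0 \qquad \text{(xxix),}$$

$$\omega_{2(s+1)} = (s+1)\,\omega_{2s} + \tfrac{1}{2}\,x\,\frac{d\omega_{2s}}{dx} \qquad \text{(xxx),}$$

$$\frac{d^2\omega_{2s}}{dx^2} + \Big(x + \frac{1}{x}\Big)\frac{d\omega_{2s}}{dx} + 2(s+1)\,\omega_{2s} = 0.$$

But the fuller treatment of the ω-functions must be deferred.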

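Applying these results to the curve of scatter given by (x), i.e.

$$\phi_n(r^2) = N\Big(\omega_0 - \frac{1}{4n}\,\omega_4 - \frac{1}{9n^2}\,\omega_6 + \frac{6n-11}{192\,n^3}\,\omega_8 + \frac{50n-57}{1800\,n^4}\,\omega_{10} - \frac{1892 - 2125n + 270n^2}{103{,}680\,n^5}\,\omega_{12} - \text{etc.}\Big) \qquad \text{(xxxiv),}$$

we have if A be the whole area and k the swing-radius,

$$A = \frac{N}{\sqrt{2\pi}\,\sigma}\cdot\frac{1}{2}\Big\{1 - \frac{3}{16}\frac{1}{n} - \frac{5}{24}\frac{1}{n^2} + \frac{35}{1024}\frac{6n-11}{n^3} + \frac{21}{1280}\frac{50n-57}{n^4} - \frac{77}{49152}\frac{1892 - 2125n + 270n^2}{n^5} - \text{etc.}\Big\} \qquad \text{(xxxv),}$$

$$A k^2 = \frac{N}{\sqrt{2\pi}\,\sigma}\cdot\frac{\sigma^2}{2}\Big\{1 + \frac{1}{16}\frac{1}{n} + \frac{1}{24}\frac{1}{n^2} - \frac{5}{1024}\frac{6n-11}{n^3} - \frac{7}{3840}\frac{50n-57}{n^4} + \frac{7}{49152}\frac{1892 - 2125n + 270n^2}{n^5} + \text{etc.}\Big\} \qquad \text{(xxxvi).}$$

Hence if we even neglect terms of order $1/n^2$, we see that the Rayleigh solution gives too large an area for the curve of section and too small a swing-radius; these values are

Rayleigh area, $\dfrac{1}{2}\dfrac{N}{\sqrt{2\pi}\,\sigma}$; Rayleigh swing-radius, σ.

True area to $\dfrac{1}{n}$, $\dfrac{N}{2\sqrt{2\pi}\,\sigma}\Big(1 - \dfrac{3}{16}\dfrac{1}{n}\Big)$; True swing-radius to $\dfrac{1}{n}$, $\sigma\Big(1 + \dfrac{1}{8n}\Big)$.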

Accordingly  for  n  small  the  graph  of  the  Rayleigh  solution  tends  to  exaggerate 
the  concentration,  i.e.  using  it  as  an  approximation  we  shall  somewhat  reduce  the 
extreme  parts  of  the  curve  at  the  expense  of  exaggerating  those  near  the  centre 
of  dispersion. 
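The comparison to order 1/n can be verified by direct quadrature of the section. The sketch below assumes that only the $\omega_0$ and $\omega_4$ terms are retained, with multiplier $-1/(4n)$ as in (x), and takes N = 1; the function names, the Simpson rule and the cut-off at 12σ are illustrative choices.

```python
import math


def simpson(f, a, b, steps=4000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def section(r, n, sigma=1.0):
    """omega_0 - omega_4 / (4n) along a section through the axis, with N = 1."""
    y = (r / sigma) ** 2
    w0 = math.exp(-0.5 * y) / (2 * math.pi * sigma ** 2)
    return w0 * (1.0 - (2.0 - 2.0 * y + 0.25 * y * y) / (4.0 * n))


def area_and_swing_radius(n, sigma=1.0):
    """Area of the half-section from 0 to infinity and its swing-radius about the axis."""
    upper = 12.0 * sigma
    area = simpson(lambda r: section(r, n, sigma), 0.0, upper)
    second = simpson(lambda r: section(r, n, sigma) * r * r, 0.0, upper)
    return area, math.sqrt(second / area)
```

For n = 8 and σ = 1 the area is less than the Rayleigh value $1/(2\sqrt{2\pi})$ by the factor $1 - 3/(16n)$, and the swing-radius exceeds σ by very nearly $1/(8n)$.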

While there is no difficulty about determining the curve of distribution when n is large from (xxxiv), beyond the great labour of dealing with hitherto untabled functions, the investigation becomes very troublesome when n is small. The functions ω are suited in this case to represent the discontinuous functions which actually form the values of $\phi_n(r^2)$, but the extreme discontinuity of $\phi_n(r^2)$ for n small, compels us to use a very great number of ω-functions, and the convergency of (xxxiv) is then small.

Another method of determining the distribution of the dispersed population has then to be applied to the case of n small.

(6) Graphical Solution of the Fundamental Problem for n small.

Let us consider the general functional relation (i)

$$\phi_{n+1}(r^2) = \frac{1}{2\pi}\int_0^{2\pi} \phi_n(r^2 + l^2 - 2rl\cos\theta)\, d\theta.$$

Suppose the graph of $\phi_n$ from 0 to nl known. This may be any discontinuous function. From nl to ∞, it will be zero. Let ABD be the graph of $\phi_n$ and OA the axis.

OP = r. Round P describe a circle of radius l, take the radius PQ, so that the angle OPQ = θ; then clearly, $OQ^2 = r^2 + l^2 - 2rl\cos\theta$; rotate OQ round O down into line OD, as ON; draw the ordinate of the graph Nq, then we have

$$Nq = \phi_n(r^2 + l^2 - 2rl\cos\theta)$$

and

$$\phi_{n+1}(OP^2) = \frac{1}{2\pi}\int_0^{2\pi} Nq\, d\theta.$$

Hence if we divide the circle up into a number of equal parts, and determine the ordinates Nq, corresponding to each of them, we can plot a curve to the base 2π, of which the mean ordinate will be $\phi_{n+1}(OP^2)$, or the ordinate at r of the new curve of dispersion for n + 1 flights. This can be done for a series of values of r from 0 to (n + 1)l and thus $\phi_{n+1}(r^2)$ will be determined as a new graph. The area of the plotted curve which gives any new ordinate can be found mechanically. It will be seen that the process is theoretically straightforward, but very laborious. Thus for the dispersion curve after the fourth flight some 43 points had to be found, and this involved the construction of 43 subsidiary curves and their integration.
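The construction just described is, in modern terms, a numerical mean over equal divisions of the circle. The following is a rough sketch only: it takes the two-flight density $N/\{\pi^2 r\sqrt{4l^2 - r^2}\}$ with N = 1 as the starting graph and uses the mid-point of each division. The function names and the number of divisions are assumptions of the sketch, and no attempt is made at the special devices needed near the asymptotes.

```python
import math


def phi2(rho2, l=1.0):
    """Two-flight density with N = 1, as a function of rho2 = r^2."""
    if rho2 <= 0.0 or rho2 >= 4.0 * l * l:
        return 0.0  # zero beyond 2l; the singular end points are simply skipped
    r = math.sqrt(rho2)
    return 1.0 / (math.pi ** 2 * r * math.sqrt(4.0 * l * l - rho2))


def next_flight(phi_n, r, l=1.0, parts=3600):
    """Mean of phi_n(r^2 + l^2 - 2 r l cos(theta)) over equal divisions of the circle."""
    total = 0.0
    for k in range(parts):
        theta = 2.0 * math.pi * (k + 0.5) / parts  # mid-point of the k-th division
        total += phi_n(r * r + l * l - 2.0 * r * l * math.cos(theta))
    return total / parts
```

At r = 0 every division gives the same ordinate, so that `next_flight(phi2, 0.0)` returns the two-flight ordinate at r = l, about 0.0585, and beyond r = 3l the mean is zero.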

There  were,  of  course,  graphical  difficulties  in  the  construction  of  the  subsidiary 
curves  in  the  neighbourhood  of  the  asymptotes  and  various  devices  had  to  be  used,  but 
at  almost  every  point  there  were  tests  of  the  accuracy  of  the  work.  Some  of  these 
I  shall  now  notice. 

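Case (i). The solution for two flights is:

$$\phi_2(r^2) = \frac{N}{\pi^2}\,\frac{1}{r\sqrt{4l^2 - r^2}}, \quad r < 2l; \qquad = 0, \quad r > 2l \qquad \text{(xxxvii).}$$

The reader will find no difficulty in deducing this directly from the case of n = 1, which corresponds to a narrow zone of radius r = l, the rest of the plane being unoccupied. Thus:

$$\phi_1 = \frac{N}{2\pi l \epsilon} \ \text{ from } r = l - \tfrac{1}{2}\epsilon \text{ to } r = l + \tfrac{1}{2}\epsilon; \qquad = 0 \ \text{ from } r = 0 \text{ to } l - \tfrac{1}{2}\epsilon \text{ and } r = l + \tfrac{1}{2}\epsilon \text{ to } \infty \qquad \text{(xxxvii bis),}$$

ε being taken indefinitely small.

By distributing each element of $\phi_1$ on the zone round a circle of radius l we obtain (xxxvii).

The result may be obtained also from (iii) by putting n = 2, i.e.

$$\phi_2(r^2) = \frac{N}{2\pi}\int_0^{\infty} u\, J_0(ur)\, \{J_0(ul)\}^2\, du = \frac{N}{2\pi}\,\frac{[(2l + r)\, r\, (2l - r)\, r]^{-\frac{1}{2}}}{\sqrt{\pi}\; 2^{-1}\, \Gamma(\tfrac{1}{2})} \ \text{ from } r = 0 \text{ to } 2l,$$

$$= 0 \ \text{ from } r = 2l \text{ to } \infty,$$

from a theorem of de Sonin by putting a = r, b = c = l. Compare Gray and Mathews, p. 239, Ex. 52.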

Case (ii). The solution for three flights may be obtained from that for two, by distributing analytically the density given by $\phi_2$ round circles of radius l about each point. The resulting double integral is then expressible in elliptic integrals*. We find:

$$\phi_3(r^2) = \frac{N}{2\pi^3\, l\, \sqrt{rl}}\; k\, F\Big(k, \frac{\pi}{2}\Big), \quad \text{where } k^2 = 16\,l^3 r/\{(r + l)^3 (3l - r)\}, \quad r > 0 \text{ and} < l;$$

$$= \frac{N}{2\pi^3\, l\, \sqrt{rl}}\; F\Big(k, \frac{\pi}{2}\Big), \quad \text{where } k^2 = (r + l)^3 (3l - r)/(16\,l^3 r), \quad r > l \text{ and} < 3l;$$

$$= 0, \quad r > 3l \text{ to } r = \infty \qquad \text{(xxxviii).}$$

* This solution, or its equivalent, was first sent me by Mr Geoffrey T. Bennett.

We have here at r = l a typical instance of the discontinuity.

In Table I. columns (i) and (ii) the calculated ordinates of $\phi_2$ and $\phi_3$ are given,
the  latter  having  been  determined  by  the  use  of  Legendre's  Tables  of  the  Elliptic 
Integral  F.  In  these  cases,  as  in  the  later  values  of  the  ordinates  of  the  dispersion 
curves,  N  is  taken  as  unity.  The  dispersion  curves  are  plotted  in  Diagrams  I. 
and  II.  The  Rayleigh  solution  is  given  in  broken  line  ;  it  will  be  noticed  how 
very  far  it  is  from  representing  the  facts  at  this  early  stage  of  the  number  of 
flights.  One  of  the  most  interesting  features  of  the  investigation  is  to  mark  the 
gradual  approximation  of  the  discontinuous  series  of  functions  to  the  Gaussian  normal 
curve  of  errors  as  the  value  of  n  increases. 
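In place of Legendre's Tables the complete elliptic integral may be computed by the arithmetic-geometric mean. The sketch below assumes the three-flight solution in the form $N k F/(2\pi^3 l\sqrt{rl})$ for r < l and $N F/(2\pi^3 l\sqrt{rl})$ for l < r < 3l, with the two moduli stated in (xxxviii) and $F = F(k, \pi/2)$; this form agrees with the two-flight ordinate at r = l as r tends to zero, but the function names and the treatment of the singular point r = l are choices made for illustration.

```python
import math


def complete_elliptic_F(k):
    """F(k, pi/2) for 0 <= k < 1, by the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):  # the iteration converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def phi3(r, l=1.0, N=1.0):
    """Three-flight density for r > 0; infinite at r = l, zero beyond 3l."""
    if r >= 3.0 * l:
        return 0.0
    if r == l:
        return math.inf
    ratio = 16.0 * l ** 3 * r / ((r + l) ** 3 * (3.0 * l - r))
    front = N / (2.0 * math.pi ** 3 * l * math.sqrt(r * l))
    if r < l:
        k = math.sqrt(ratio)          # modulus for the inner branch
        return front * k * complete_elliptic_F(k)
    k = math.sqrt(1.0 / ratio)        # reciprocal modulus for the outer branch
    return front * complete_elliptic_F(k)
```

Close to the origin the sketch returns about 0.0585, and just short of r = 3l it returns the finite value $1/(4\pi^2\sqrt{3})$ before the density drops to zero.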

The  first  test  of  the  graphical  method  of  dealing  with  the  problem  was  to 
start from the curve for n = 2 and construct the graph of $\phi_3$. The result was
found  to  be  extremely  close  to  the  elliptic  integral  solution  obtained  by  analysis 
and  calculated  from  Legendre,  and  this  gave  us  every  confidence  in  the  correctness 
within  reasonable  limits  of  the  graphical  solution,  where  no  such  direct  verification 
was  possible.  After  the  ordinates  of  any  graph  had  been  found  their  differences 
were  plotted,  and  these  difference  curves  submitted  to  most  careful  inspection. 
Larger irregularities led to a reinvestigation of the points, smaller irregularities were
smoothed  with  the  spline,  and  from  the  final  smoothed  difference  curve  the  ordinates 
were  corrected. 

Another test was now possible. In every case $2\pi\int_0^{\infty} \phi_n(r^2)\, r\, dr$ ought to be unity. Each ordinate was now multiplied by its r and a quadrature formula used to find the integral. The integral would usually differ very slightly from unity. Its reciprocal was then used as a factor to each ordinate and the ordinates so modified were the final corrected ordinates of the corresponding graph. The graphs were made on a large scale, and the accompanying Table I., columns (iii) to (vi), gives the ordinates of the dispersion curves from four to seven flights.

Additional tests were as follows:

Since

$$\phi_{n+1}(r^2) = \frac{1}{2\pi}\int_0^{2\pi} \phi_n(r^2 + l^2 - 2lr\cos\theta)\, d\theta,$$

it follows that

$$\phi_{n+1}(0) = \frac{1}{2\pi}\int_0^{2\pi} \phi_n(l^2)\, d\theta = \phi_n(l^2),$$

or: The axial ordinate of the (n + 1)th dispersion curve is the ordinate at a distance l, or a flight, from the centre of the nth dispersion curve. Table IV. illustrates the degree of accuracy reached here.

The ordinate at r = l of the seventh curve given by the expansion in ω-functions is 0.0375, and this is precisely the value of the central ordinate of the eighth curve given by the same expansion. Thus the graphical method runs with surprising accuracy into the analytical. The Rayleigh solution gives 0.0398 for the central ordinate of the eighth curve as against the 0.0375 of the ω-expansion, or the 0.0378 of the graphical construction. The fact that the central ordinate of the sixth curve is almost identical with the ordinate of the fourth curve at r = l, seems conclusive as to the general accuracy of the process.

Table II. Values of the ω-functions.

r/σ    ω_0          ω_2          ω_4          ω_6          ω_8          ω_10          ω_12
0.0    +0.1591550   +0.1591550   +0.3183100   +0.9549300   +3.8197200   +19.0986000   +114.5916000
0.1    +0.1583611   +0.1575693   +0.3135589   +0.9359497   +3.7249378   +18.5306202   +110.6207235
0.2    +0.1560035   +0.1528834   +0.2995891   +0.8804201   +3.4490302   +16.8855699    +99.1778011
0.3    +0.1521517   +0.1453049   +0.2772242   +0.7924264   +3.0163079   +14.3322150    +81.6017164
0.4    +0.1469185   +0.1351650   +0.2477634   +0.6783356   +2.4642124   +11.1274046    +59.9059470
0.5    +0.1404537   +0.1228970   +0.2128751   +0.5461784   +1.8390999    +7.5831582    +36.4893471
0.6    +0.1329374   +0.1090087   +0.1744670   +0.4048965   +1.1911906    +4.0279574    +13.8027339
0.7    +0.1245713   +0.0940513   +0.1345401   +0.2635330   +0.5693039    +0.7677288     -5.9756743
0.8    +0.1155702   +0.0785877   +0.0950449   +0.1304593   +0.0160640    -1.9479139    -21.2053207
0.9    +0.1061526   +0.0631608   +0.0577497   +0.0127166   -0.4358815    -3.9498656    -30.9517904
1.0    +0.0965323   +0.0482662   +0.0241331   -0.0844658   -0.7662251    -5.1614614    -35.0397166
1.2    +0.0774690   +0.0216913   -0.0280128   -0.2066600   -1.0457098    -5.3519170    -28.8749613
1.4    +0.0597326   +0.0011947   -0.0573194   -0.2352026   -0.9000451    -3.4551198    -12.1191731
1.6    +0.0442510   -0.0123903   -0.0655623   -0.1943306   -0.5215103    -0.9167705     +4.1267483
1.8    +0.0314966   -0.0195279   -0.0584451   -0.1194328   -0.1165429    +1.0508391    +12.7704428
2.0    +0.0215393   -0.0215393   -0.0430786   -0.0430786   +0.1723144    +1.8954584    +12.7512656
2.2    +0.0141523   -0.0200963   -0.0258081   +0.0138001   +0.2954776    +1.7234411     +7.4001858
2.4    +0.0089341   -0.0167961   -0.0109496   +0.0439712   +0.2797081    +1.0082741     +1.1944811
2.6    +0.0054188   -0.0128967   -0.0005180   +0.0507478   +0.1883692    +0.2466709     -2.8296050
2.8    +0.0031578   -0.0092208   +0.0053253   +0.0426344   +0.0833863    -0.2585489     -3.9151831
3.0    +0.0017680   -0.0061880   +0.0075140   +0.0285090   +0.0036465    -0.4397348     -2.9494384
3.2    +0.0009511   -0.0039185   +0.0073562   +0.0147914   -0.0383979    -0.3856461     -1.3081481
3.4    +0.0004916   -0.0023498   +0.0060410   +0.0046874   -0.0486501    -0.2316523     +0.0070283

I owe this preliminary table of ω-functions to the kindness of Dr Alice Lee. Much more elaborate tables will have to be calculated, if as I anticipate the ω-functions are found valuable for other purposes. The present table suffices to indicate their general numerical character, and enables one to calculate some of the quantities needed in the present memoir.
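Entries of Table II. can be regenerated from the finite difference equation (xxix). The sketch below assumes σ = 1, $\omega_0 = e^{-\frac{1}{2}x^2}/2\pi$ and $\omega_2 = (1 - \tfrac{1}{2}x^2)\,\omega_0$ as starting values, an assumption which the tabulated figures bear out; the function name is illustrative.

```python
import math


def omega_row(x, s_max=6):
    """Return [omega_0, omega_2, ..., omega_{2 s_max}] at x = r / sigma, with sigma = 1."""
    w0 = math.exp(-0.5 * x * x) / (2.0 * math.pi)
    values = [w0, (1.0 - 0.5 * x * x) * w0]
    for s in range(s_max - 1):
        # (xxix): omega_{2(s+2)} = (2s + 3 - x^2/2) omega_{2(s+1)} - (s+1)^2 omega_{2s}
        values.append((2 * s + 3 - 0.5 * x * x) * values[s + 1]
                      - (s + 1) ** 2 * values[s])
    return values[: s_max + 1]
```

Calling `omega_row(2.0)` returns seven values which agree with the row for r/σ = 2.0 to about the fifth decimal place.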

The above test of the general accuracy of Mr Blakeman's graphical work is only a part of the still more sufficient test that in the seventh curve the graph and the ω-expansion practically coincide. See Diagram VI. After r = 5l the two curves cannot be distinguished, and between r = 0 and 5l the deviation is probably as much due to the neglect of higher ω-functions as to errors in the graphical treatment.

Another method adopted by Mr Blakeman for testing the accuracy of his graphical work, especially at the end of the range, was to obtain expansions to $\phi_n(r^2)$, when r does not differ much from nl, $= nl - \xi$, say, where ξ is supposed small. If $f_n(\xi) = \phi_n((nl - \xi)^2)$, then generally for ξ small:

f  (^)  =   ^!   A         IJJ,...I^_,  (xxxix), 

-^"^^^     l'{j2Y^-''rT''-'Jn\l}  ^  ^  ^      -  ^  V  ' 

where $I_n = \int_{-\pi/2}^{\pi/2} \cos^n\theta\, d\theta = \Gamma(\tfrac{1}{2}(n+1))\, \Gamma(\tfrac{1}{2})/\Gamma(\tfrac{1}{2}(n+2))$.

This can be proved by induction.

Table III. Table of the ν constants.

m = 1     n = 6           n = 7           n = 8
ν_4       -0.041666667    -0.035714286    -0.031250000
ν_6       -0.003086420    -0.002267574    -0.001736111
ν_8       +0.000602816    +0.000470724    +0.000376383
ν_10      +0.000104167    +0.000067796    +0.000046522
ν_12      +0.000001412    -0.000000142    -0.000000639

Table of the N constants.

m = 5     n = 6           n = 7           n = 8
N_4       -0.008333333    -0.007142857    -0.006250000
N_6       (illegible)     -0.000090703    -0.000069444
N_8       +0.000032600    +0.000024174    +0.000018636
N_10      +0.000000990    +0.000000627    +0.000000422
N_12      -0.000000078    -0.000000047    -0.000000033

m = 10    n = 6           n = 7           n = 8
N_4       -0.004166667    -0.003571429    -0.003125000
N_6       -0.000030864    -0.000022676    -0.000017361
N_8       +0.000008415    +0.000006211    +0.000004771
N_10      +0.000000126    +0.000000082    +0.000000053
N_12      -0.000000011    -0.000000007    -0.000000006

m = 20    n = 6           n = 7           n = 8
N_4       -0.002083333    -0.001785714    -0.001562500
N_6       -0.000007716    -0.000005669    -0.000004340
N_8       +0.000002137    +0.000001574    +0.000001207
N_10      +0.000000016    +0.000000010    +0.000000007
N_12      -0.0000000014   -0.0000000009   -0.0000000006

Table IV. Central ordinates and ordinates at r = l.

No. of Flights    Central ordinate    Ordinate at r = l
First             0                   ∞
Second            ∞                   0.0585
Third             0.0585              ∞
Fourth            ∞                   0.0537
Fifth             0.0537              0.0537
Sixth             0.0538              0.0415
Seventh           0.0415              0.0378

For  most  of  the  cases  more  approximate  formulae  still  were  deduced.    Thus  : 

^'^^^^        7f  + r8  T  +  ]^  y )  (^1^)' 

^^(^)=wW^(Tr(^-^^T+"-)  (^i^ii)^ 

after which the first term only as given by (xxxix) is sufficient. It will be observed that after $\phi_5(r^2)$, the curve touches at r = nl or ξ = 0, and the contact becomes higher and higher as n increases. Thus, although short of n = ∞, there is no real asymptoting to the axis, still $\phi_n(r^2)$ for n > 5 not only vanishes for r = nl, but has increasingly higher contact as n increases. This explains how the Gaussian curve can fairly well represent the state of affairs towards the end of the dispersal range, if n is > 5.

Mr  Blakeman  found  that  the  ends  of  the  range  for  the  various  cases  ran  closely 
into  the  curves  (xl)  to  (xliii),  and  they  were  tested  and,  if  needful,  corrected  by 
these  formulae. 

Thus  the  whole  graphical  work  was  kept  in  check,  and,  I  think,  we  may  be 
confident  that  the  true  forms  of  the  dispersal  curves  for  n  =  4  to  7  are  really 
given  by  our  diagrams  and  tables. 

(7)    We  may  note  a  few  features  of  these  curves. 

Dispersal  Curve  for  Two  Flights  (Diagram  I.). 

There is no discontinuity in the solution from r = 0 to 2l, the range within which all individuals fall. The curve asymptotes to the vertical at the axis and at r = 2l. Of course, while the density becomes infinite, the number on any small area near r = 0 or r = 2l, is finite. Thus the number between the circles of radii $r_1$ and $r_2$ is

$$\frac{2N}{\pi}\Big(\sin^{-1}\frac{r_2}{2l} - \sin^{-1}\frac{r_1}{2l}\Big).$$

If $r_1 = 0$ and $r_2 = \epsilon_1$, where $\epsilon_1$ is small, the number $\nu_1$ within the small circle of radius $\epsilon_1$ at the centre of dispersion $= N\epsilon_1/(\pi l)$. If $r_2 = r_1 + \epsilon_2$, the number lying on the zone of breadth $\epsilon_2$ is $\dfrac{N\epsilon_2}{\pi l}\Big(1 - \dfrac{r_1^2}{4l^2}\Big)^{-\frac{1}{2}}$, and this if $r_1 = 2l - \epsilon_2$ is $\nu_2 = \dfrac{N}{\pi}\sqrt{\dfrac{\epsilon_2}{l}}$. At the position of minimum density $r_1 = \sqrt{2}\, l$, and the number on the zone $r_1$ to $r_1 + \epsilon_3$ is $\nu_3 = N\epsilon_3/(\pi l\sqrt{\tfrac{1}{2}})$. Hence it follows that the numbers on narrow zones $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ in breadth, of equal areas $\pi\epsilon_1^2 = \pi 4 l \epsilon_2 = \pi 2\sqrt{2}\, l \epsilon_3$, are given by

$$N\epsilon_1/(\pi l), \qquad N\sqrt{\epsilon_2}/(\pi\sqrt{l}), \qquad \text{and} \qquad N\epsilon_3\sqrt{2}/(\pi l),$$

or in the ratio

$$N\epsilon_1/(\pi l) \;:\; \tfrac{1}{2}N\epsilon_1/(\pi l) \;:\; N\epsilon_1/(\pi l) \times \frac{\epsilon_1}{2l}.$$

Thus the total population on a small area at the centre of dispersion is twice that on an equal area at the periphery of the distribution, and at both indefinitely greater than on an equal belt at the distance of minimum density. The same point can be indicated in another way. From r = 0 to r = l is ¼ of the total area occupied after dispersion, it contains 0.33N or about ⅓ of the total population; from $r = \sqrt{3}\, l$ to r = 2l is ¼ of the total area, it contains 0.33N. In other words the half of the area nearest and farthest from the centre of dispersion contains ⅔ of the dispersed population; the "middle" half of the area contains only ⅓ of the population. The nature of the distribution is thus extremely different from that given by the rotation of the Gaussian curve about its axis for this small number of flights. For in the Gaussian case if the central area $\pi\epsilon_1^2 = 2\pi r_1\epsilon_2$, the area of the zone at distance $r_1$, the population on the centre patch is $\tfrac{1}{2}N\epsilon_1^2/\sigma^2$ and on the zone is

$$\tfrac{1}{2}\,\frac{N\epsilon_1^2}{\sigma^2}\, e^{-\frac{1}{2} r_1^2/\sigma^2},$$

which is always less and diminishes continuously with increase of $r_1$. Thus the Rayleigh solution fails in this, as in the next three cases, not only to give the form of the curve at dispersion, but to indicate that the dispersed populations on zones of equal area round the centre do not decrease uniformly in number.
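The division of the two-flight population into thirds follows from integrating the density $N/\{\pi^2 r\sqrt{4l^2 - r^2}\}$ over a zone, which gives an inverse sine. A minimal sketch, with a function name chosen for illustration and N taken as unity:

```python
import math


def fraction_between(r1, r2, l=1.0):
    """Fraction of the population between circles of radii r1 <= r2 <= 2l after two flights."""
    return (2.0 / math.pi) * (math.asin(r2 / (2.0 * l)) - math.asin(r1 / (2.0 * l)))
```

The inner quarter of the area (r from 0 to l) and the outer quarter (r from $\sqrt{3}\,l$ to 2l) each return one third, leaving one third for the middle half of the area.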

Dispersal  Curve  for  Three  Flights  (Diagram  II.). 

The  solution  is  discontinuous  at  r  =  l.  The  density  is  here  infinite,  but  has 
become  finite  at  the  origin.  There  is  no  discontinuity  at  r  =  2l,  but  at  the  end 
of the range the density drops suddenly from a finite value to zero. Thus the integral of the Bessel function product (see Eqn. (iii)) is discontinuous at two points.
The  Rayleigh  solution  is  still  widely  divergent  from  the  true  curve  of  dispersal. 

Dispersal Curve for Four Flights (Diagram III.).

By the rule already referred to (p. 18) the infinite density has returned to the origin. There are only two points of discontinuity, i.e., at r = l and r = 4l, the end of the range, at both of which there is an abrupt change in the slope of the curve. The density at the end of the range is now zero and will remain so, but the dispersal curve rises at right angles to the axis. The true dispersal curve is bending round somewhat to the Rayleigh curve, but the latter is not even yet a rough approximation to the facts.

Dispersal Curve for Five Flights (Diagram IV.).

All infinite densities have now finally disappeared. The density vanishes at the end of the range, but the dispersal curve makes a finite angle with the horizontal axis. There is a marked discontinuity of slope at r = l; a still more noteworthy feature is that from r = 0 to r = l the graphical construction, however carefully reinvestigated, did not permit of our considering the curve to be anything but a straight line. If this could be verified from the analytical expression

$$\phi_5(r^2) = \frac{N}{2\pi}\int_0^{\infty} u\, J_0(ur)\, \{J_0(ul)\}^5\, du$$

by showing that the integral is independent of r from 0 to l it would be of much interest. Even if it be not absolutely true, it exemplifies the extraordinary power of such integrals of J products to give extremely close approximations to such simple forms as horizontal lines.

The  approach  of  the  Rayleigh  curve  to  the  result  is  now  more  noticeable. 

Dispersal  Curve  for  Six  Flights  (Diagram  V.). 

There  is  contact  now  of  the  first  order  at  the  end  of  the  range.  From  r  =  0 
to  r  =  l  the  curve  of  dispersal  appears  to  be  a  sloping  straight  line  tangential  to 
the  continuous  curve  from  r  =  l  to  r  =  6l.  No  other  discontinuity  of  a  low  order 
is  now  visible.  The  curve,  except  for  the  finite  slope  at  r  =  0,  is  becoming  much 
more of the Gaussian form. It runs fairly closely to the solution in $\omega$-functions up
to $\omega_{12}$; in fact it is not separable from that solution at the extreme part of the range, where the Rayleigh
curve  still  gives  finite  ordinates  beyond  the  possible  range. 

Dispersal  Curve  for  Seven  Flights  (Diagram  VI.). 

All  sign  of  discontinuity  has  gone,  the  curve  is  horizontal  at  the  centre  of 
dispersion  and  might  be  easily  mistaken  for  a  normal  curve  of  errors.  The  expansion 
in $\omega$-functions represents the result within the limits almost of constructional error.
It  was  not  thought  necessary  to  continue  the  graphical  work  beyond  this  stage. 
We  may  conclude  that : 

The deviation of the Rayleigh solution for seven and more flights from the
true dispersal curve is practically the same as its deviation from the solution in
$\omega$-functions when five terms of that series are retained.

This  I  think  completes  the  full  solution  of  the  fundamental  problem.  The  dispersal 
curves for the cases of 2 to 7 flights are given in the Table I. of ordinates and the
Diagrams I. to VI. For higher values the $\omega$-function series gives the solution. This
solution could be applied to calculate the ordinates of the dispersal curve for fewer
flights than 6 or 7, but several more $\omega$-functions would have to be used and the
arithmetical work, especially while these functions are as yet untabled*, then
becomes somewhat severe.

* Table II. provides a preliminary series of values of the $\omega$-functions.

(8) Secondary Migration Problems.

Problem I. On one side of a straight line there is supposed to be a uniform
distribution of habitats; on the other at starting no habitats. To investigate the
distribution in the unoccupied area after one migration. Each individual is supposed
to take n flights to the new habitat.


Let YY be the straight line and O a point at distance c from it on the unoccupied
side of it. Let N be the average density per unit of area on the occupied side. Then
after an n-flight migration, the contribution from P (co-ordinates r, $\chi$) at O will
be $N r\,\delta\chi\,\delta r\,\phi_n(r^2)$, and integrating this all round a circle of radius r from A to C
within the occupied area, we have for the quantity $F_n(c)$ at O

$$F_n(c) = 2N\int_c^\infty\!\int_0^{\cos^{-1}(c/r)} \phi_n(r^2)\,r\,d\chi\,dr
= 2N\int_c^\infty \cos^{-1}(c/r)\,\phi_n(r^2)\,r\,dr.$$


Hence 


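The evaluation of this integral needs a further consideration of the $\omega$-functions.
By (xiv)

Transfer the differentiations from      to $\sigma^2$ and we have:

$$\omega_{2s} = (-1)^s\,(\sigma^2)^s\,\frac{d^s\omega_0}{d(\sigma^2)^s} \qquad \text{(xlv)},$$

or, all the $\omega$-functions can be found by differentiating the first $\omega$-function with regard
to the standard-deviation squared. Then by (xii) we have

$$\phi_n(r^2) = \left\{1 + \nu_4(\sigma^2)^2\frac{d^2}{d(\sigma^2)^2} - \nu_6(\sigma^2)^3\frac{d^3}{d(\sigma^2)^3} + \ldots + (-1)^s\nu_{2s}(\sigma^2)^s\frac{d^s}{d(\sigma^2)^s} + \ldots\right\}\omega_0 \qquad \text{(xlvi)}.$$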

Thus, if we put $\sigma^2 = t$:
But:  /\.(o.  +  ,')  c^,= 


STT  cr 


\/27r  cr 


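Now $\int_c^\infty e^{-c^2/2\sigma^2}\,dc = \sigma\int_{c/\sigma}^\infty e^{-\frac{1}{2}x^2}\,dx$, and this integral vanishes when
$c = \infty$. Hence

$$F_n(c) = N\left\{1 + \nu_4 t^2\frac{d^2}{dt^2} - \nu_6 t^3\frac{d^3}{dt^3} + \ldots + (-1)^s\nu_{2s}t^s\frac{d^s}{dt^s} + \ldots\right\}\frac{1}{\sqrt{2\pi}}\int_{c/\sigma}^\infty e^{-\frac{1}{2}x^2}\,dx \qquad \text{(xlvii)},$$

where $t = \sigma^2$. Since $F_n(c)$ clearly vanishes for c infinite, it is not needful to introduce a
constant.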

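It remains accordingly to determine the successive differentials of the integral
with regard to t. Call the integral i; then, if $\eta = c/\sigma = c/\sqrt{t}$,

$$\frac{di}{dt} = -e^{-\frac{1}{2}\eta^2}\,\frac{d\eta}{dt} = \frac{1}{2}\,\frac{c}{t^{3/2}}\,e^{-c^2/2t} = \frac{\pi c}{\sqrt{t}}\,\omega_0 .$$

By (xlv) we know that $d^s\omega_0/dt^s = (-1)^s\,\omega_{2s}/t^s$. Hence differentiating s − 1 times
we have:

$$\frac{d^s i}{dt^s} = \pi c\left\{\frac{d^{s-1}\omega_0}{dt^{s-1}}\,\frac{1}{t^{1/2}} - (s-1)\,\frac{d^{s-2}\omega_0}{dt^{s-2}}\,\frac{1}{2\,t^{3/2}} + \frac{(s-1)(s-2)}{2!}\,\frac{d^{s-3}\omega_0}{dt^{s-3}}\,\frac{1\cdot 3}{2^2\,t^{5/2}} - \ldots\right\}$$

$$= \frac{(-1)^{s-1}\,\pi c}{t^{\,s-1/2}}\left\{\omega_{2(s-1)} + \frac{s-1}{1}\cdot\frac{1}{2}\,\omega_{2(s-2)} + \frac{(s-1)(s-2)}{1\cdot 2}\cdot\frac{1\cdot 3}{2^2}\,\omega_{2(s-3)} + \frac{(s-1)(s-2)(s-3)}{1\cdot 2\cdot 3}\cdot\frac{1\cdot 3\cdot 5}{2^3}\,\omega_{2(s-4)} + \text{etc.}\right\}.$$

Thus

$$t^s\,\frac{d^s i}{dt^s} = (-1)^{s-1}\,\pi\,\frac{c}{\sigma}\left(\sigma^2\omega_{2(s-1)} + \frac{s-1}{2}\,\sigma^2\omega_{2(s-2)} + \frac{(s-1)(s-2)}{2!\,2^2}\,1\cdot 3\;\sigma^2\omega_{2(s-3)} + \frac{(s-1)(s-2)(s-3)}{3!\,2^3}\,1\cdot 3\cdot 5\;\sigma^2\omega_{2(s-4)} + \text{etc.}\right) \qquad \text{(xlviii)}.$$

Substituting in (xlvii) we have, if

$$\psi\!\left(\frac{c}{\sigma}\right) = \frac{1}{\sqrt{2\pi}}\int_{c/\sigma}^\infty e^{-\frac{1}{2}x^2}\,dx \qquad \text{(xlix)},$$

$$\begin{aligned}
F_n(c) = N\Big[\psi\!\left(\tfrac{c}{\sigma}\right) - \sqrt{\tfrac{\pi}{2}}\;\tfrac{c}{\sigma}\,\Big\{
 &\sigma^2\omega_0\left(\tfrac{1}{2}\nu_4 + \tfrac{3}{4}\nu_6 + \tfrac{15}{8}\nu_8 + \tfrac{105}{16}\nu_{10} + \tfrac{945}{32}\nu_{12}\right) \\
 +\; &\sigma^2\omega_2\left(\nu_4 + \nu_6 + \tfrac{9}{4}\nu_8 + \tfrac{15}{2}\nu_{10} + \tfrac{525}{16}\nu_{12}\right) \\
 +\; &\sigma^2\omega_4\left(\nu_6 + \tfrac{3}{2}\nu_8 + \tfrac{9}{2}\nu_{10} + \tfrac{75}{4}\nu_{12}\right)
 + \sigma^2\omega_6\left(\nu_8 + 2\nu_{10} + \tfrac{15}{2}\nu_{12}\right) \\
 +\; &\sigma^2\omega_8\left(\nu_{10} + \tfrac{5}{2}\nu_{12}\right) + \sigma^2\omega_{10}\,\nu_{12} + \ldots\Big\}\Big] \qquad \text{(l)},
\end{aligned}$$

as far as coefficients of the order $\nu_{12}$ and functions of order $\omega_{10}$.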

This is the solution in $\omega$-functions. Table III., p. 21, gives the values of the $\nu$'s for
certain values of n, and Table II., p. 20, is a preliminary table of the $\omega$-functions.
These will enable us to readily find the values of $F_n(c)$. I have done this for the case
of n = 6 and n = 7, which will suffice to illustrate the character of these curves.
$\psi(c/\sigma)$ can be found at once from Tables of the probability integral. It is drawn
with a broken line in Diagram VII. and is the Rayleigh solution for this case. I term
$F_n(c)$ an "infiltration curve" of the first order.
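
As a numerical illustration of why $\psi(c/\sigma)$ is the Rayleigh solution here, the sketch below evaluates the single integral $F_n(c) = 2N\int_c^\infty \cos^{-1}(c/r)\,\phi_n(r^2)\,r\,dr$ with the Gaussian (Rayleigh) form $e^{-r^2/2\sigma^2}/(2\pi\sigma^2)$ substituted for $\phi_n$, and compares the result with the upper tail of the normal integral. The function names, the midpoint rule and the truncation of the range at $c + 10\sigma$ are assumptions made for this illustration; they are not taken from the memoir.

```python
import math


def psi(eta):
    """Upper tail of the normal integral: (1/sqrt(2*pi)) * int_eta^inf exp(-x^2/2) dx."""
    return 0.5 * math.erfc(eta / math.sqrt(2.0))


def rayleigh_infiltration(c, sigma=1.0, steps=20000):
    """F(c)/N for c >= 0 with the Gaussian (Rayleigh) density in place of phi_n.

    Midpoint rule on r in [c, c + 10*sigma] (an assumed cut-off; the neglected
    tail is far smaller than the error of the rule itself).
    """
    h = 10.0 * sigma / steps
    total = 0.0
    for k in range(steps):
        r = c + (k + 0.5) * h
        phi = math.exp(-r * r / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)
        # acos(c/r) is half the angle of the arc of radius r lying in the occupied area
        total += math.acos(c / r) * phi * r * h
    return 2.0 * total


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0, 2.0):
        print(c, round(rayleigh_infiltration(c), 4), round(psi(c), 4))
```

The two printed columns should agree to about four places, which is all the check is meant to show; the corrections in $\omega$-functions are not included.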

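Substituting the values of the $\nu$'s from Table III., we have for n = 6:

$$\begin{aligned}
F_6(c)/N = \psi\!\left(\tfrac{c}{\sigma}\right) + \tfrac{c}{\sigma}\,\{\,&0.026\,712\,414\,(\sigma^2\omega_0) + 0.053\,325\,539\,(\sigma^2\omega_2) \\
 &+ 0.002\,114\,303\,(\sigma^2\omega_4) - 0.001\,029\,898\,(\sigma^2\omega_6) \\
 &- 0.000\,134\,978\,(\sigma^2\omega_8) - 0.000\,001\,770\,(\sigma^2\omega_{10}) + \ldots\},
\end{aligned}$$

and for n = 7:

$$\begin{aligned}
F_7(c)/N = \psi\!\left(\tfrac{c}{\sigma}\right) + \tfrac{c}{\sigma}\,\{\,&0.022\,850\,925\,(\sigma^2\omega_0) + 0.045\,644\,347\,(\sigma^2\omega_2) \\
 &+ 0.001\,578\,008\,(\sigma^2\omega_4) - 0.000\,758\,570\,(\sigma^2\omega_6) \\
 &- 0.000\,084\,525\,(\sigma^2\omega_8) + 0.000\,000\,178\,(\sigma^2\omega_{10}) + \ldots\}.
\end{aligned}$$

The first term $\psi(c/\sigma)$ is the ogive curve already drawn corresponding to the
Rayleigh solution. We see at once that the term $\sigma^2\omega_{10}$ will not affect the
fourth place of decimals.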

Table V. Ordinates of Infiltration Curve over straight Boundary.

  +c/σ     n = 6     n = 7     n = ∞       -c/σ     n = 6     n = 7     n = ∞
   0       .5000     .5000     .5000        --       --        --        --
   .1      .4614     .4612     .4602       - .1     .5386     .5388     .5398
   .2      .4231     .4228     .4207       - .2     .5769     .5772     .5793
   .3      .3854     .3850     .3821       - .3     .6146     .6150     .6179
   .4      .3488     .3483     .3446       - .4     .6512     .6517     .6554
   .5      .3135     .3128     .3085       - .5     .6865     .6872     .6915
   .6      .2797     .2790     .2743       - .6     .7203     .7210     .7257
   .7      .2478     .2469     .2420       - .7     .7522     .7531     .7580
   .8      .2177     .2169     .2119       - .8     .7823     .7831     .7881
   .9      .1898     .1889     .1841       - .9     .8102     .8111     .8159
  1.0      .1640     .1632     .1587       -1.0     .8360     .8368     .8413
  1.2      .1193     .1186     .1151       -1.2     .8807     .8814     .8849
  1.4      .0834     .0830     .0808       -1.4     .9166     .9170     .9192
  1.6      .0558     .0557     .0548       -1.6     .9442     .9443     .9452
  1.8      .0356     .0356     .0359       -1.8     .9644     .9644     .9641
  2.0      .0215     .0214     .0228       -2.0     .9785     .9786     .9772
  2.2      .0121     .0124     .0139       -2.2     .9879     .9876     .9861
  2.4      .0064     .0066     .0082       -2.4     .9936     .9934     .9918
  2.6      .0030     .0033     .0047       -2.6     .9970     .9967     .9953
  2.8      .0015     .0013     .0026       -2.8     .9985     .9987     .9974
  3.0      .00046    .00060    .00135      -3.0     .99954    .99940    .99865
  3.2      .00012    .00020    .00069      -3.2     .99988    .99980    .99931
  3.4      .00000    .00004    .00034      -3.4    1.00000    .99996    .99966

n = ∞ is used to denote the Rayleigh solution.
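
The n = ∞ column of Table V can be regenerated from the probability integral alone. The sketch below assumes, as (xlix) states, that the Rayleigh ordinate is the upper tail of the normal integral at $c/\sigma$; the function names are chosen here for illustration. The n = 6 and n = 7 columns need the $\omega$-functions and $\nu$-coefficients and are not attempted.

```python
import math


def psi(eta):
    """Upper tail of the normal integral at eta (the Rayleigh ordinate F/N)."""
    return 0.5 * math.erfc(eta / math.sqrt(2.0))


def rayleigh_rows(etas):
    """Return (c/sigma, ordinate at +c/sigma, ordinate at -c/sigma) for each abscissa."""
    return [(eta, psi(eta), psi(-eta)) for eta in etas]


if __name__ == "__main__":
    for eta, plus, minus in rayleigh_rows([0.0, 0.1, 0.5, 1.0, 2.0, 3.0, 3.4]):
        print(f"{eta:4.1f}   {plus:.5f}   {minus:.5f}")
```

Each pair of ordinates at $\pm c/\sigma$ sums to unity, which is the central symmetry of the infiltration curve about the point where it cuts the boundary.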


This table suggests some interesting points. The curves for n = 6 and n = 7 are
fairly close together, but differ sensibly from the Rayleigh solution, perhaps
4 or 5 per cent., where the density is at all material. For many practical
purposes this might be close enough, and we see that for infiltration as distinct
from dispersal curves, the Rayleigh solution, owing to integration over an area,
gives fairly close results. The greatest percentage deviations from the Rayleigh
solution are to be found in the tail. Now no individual can be found beyond
the range nl from the boundary, and $\sigma = \sqrt{\tfrac{1}{2}n}\,l$; thus the maximum range is $\sqrt{2n}\,\sigma$,
or, for n = 6 and 7, the maximum range is 3.46$\sigma$ and 3.74$\sigma$ respectively. The
$\omega$-function expansion brings this out well. For n = 6 at 3.4$\sigma$ there is not one in
100,000 individuals, while the Rayleigh solution gives 34. For n = 7 there are
still 4 in the 100,000, because we are a little distance still from the limit of
the range. The Rayleigh solution continues to give sensible densities beyond
the range, although they may be sufficiently small to be neglected in practice.
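
The maximum ranges quoted above can be checked in one line. The only assumption is the relation $\sigma^2 = \tfrac{1}{2}nl^2$ between the standard deviation and the flight length, which is the relation the passage itself relies on:

```latex
\[
  l = \sigma\sqrt{\frac{2}{n}}
  \quad\Longrightarrow\quad
  nl = \sqrt{2n}\,\sigma ,
  \qquad
  \sqrt{12}\,\sigma \approx 3.46\,\sigma \ (n = 6),
  \qquad
  \sqrt{14}\,\sigma \approx 3.74\,\sigma \ (n = 7).
\]
```

The last tabled abscissa, 3.4$\sigma$, therefore lies just inside the limit for n = 6 and somewhat further inside it for n = 7.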

For rough purposes a first approximation to the infiltration curves may be found
from the Rayleigh solution; they will err on the side of safety if we are considering
the effect of a clearance at a considerable distance from the boundary.
But with the aid of the tables of the $\omega$-functions and the $\nu$-coefficients, it is
not difficult to obtain the actual form of the infiltration curves as I have done
in the present case. Diagram VII. compares the Rayleigh approximation and the
infiltration curve for n = 7.

It  will  be  seen  that  an  infiltration  curve  of  the  first  order  gives  not  only  the 
density  of  the  population  after  a  first  migration  into  cleared  or  unoccupied  area 
across  a  straight  boundary,  but  also  the  diminution  of  density  on  the  populated 
side of the area, when we put c negative, i.e. it gives both the 'depopulation'
and 'repopulation.' The reduced density at the boundary is $\tfrac{1}{2}N$, and if we take
the  point  where  the  infiltration  curve  cuts  the  vertical  through  the  boundary  as 
origin,  we  see  that  it  is  centrally  symmetrical;  or  the  loss  of  population  at  a 
given  distance  from  the  boundary  is  exactly  equal  to  the  gain  at  the  same 
distance  on  the  opposite  side  of  the  boundary. 

If we require an infiltration curve of the second order, we must now multiply
the ordinates of the curve of the first order by (i) the average fertility of the
species, say $\mu$, and (ii) the survival rate $\lambda$. If the environment be the same on
either side of the boundary, and neither $\mu$ nor $\lambda$ affected by the density of the
population, then $\mu\lambda$ may be treated as a constant and the infiltration curves of
higher orders can be found with moderate ease for simple cases. We thus have
the distributions after two, three or more migrations accompanied by reproduction
and death. On the other hand both $\mu$ and $\lambda$ may be functions of the density
of the population, and in this case the ordinates of the infiltration curves of the
second and higher orders can only be determined when the nature of $\mu$ and $\lambda$
is known. On the whole it is probable that the average fertility depending
on the mating frequency will be highest where the density is greatest, as mating
opportunities will then be most frequent, but in such cases the survival rate
$\lambda$ may be lower, as more enemies are likely to be present and the food supply
is also likely to be less, where the population is densest. Thus $\mu\lambda$ as a whole
may not be very different on the depopulated and repopulated sides of the
boundary. We shall only consider in this memoir cases in which this product
is (i) supposed constant throughout, or (ii) constant for each migration season; but
supposing uniform environment on both sides of the boundary, it is conceivable
that $\mu\lambda$ will be correlated with the population density and this will modify the
basis of the distribution from which the second and later migrations start.

(9) Problem II. To investigate the distribution after m migrations from uniformly
densely occupied space across a straight boundary into unoccupied space.

Let  the  axis  of  x  be  taken  perpendicular  to  the  boundary  and  the  axis  of 
y  be  the  boundary.  Let  us  consider  the  density  at  x  =  c,  on  the  originally 
unoccupied  side  of  the  boundary.  Then  the  density  at  a  distance  x  from  the 
boundary is given by (xlvii), or if we write the operator as $Q_t$, we have

$$F_n(x) = N Q_t\,\frac{1}{\sqrt{2\pi}}\int_{x/\sigma}^\infty e^{-\frac{1}{2}x^2}\,dx = u_1 \text{ say} \qquad \text{(li)}.$$

Here $Q_t$ involves only n and $\sigma$ and not x.

Now the distance r from the point x, y to the point c, 0 at which we
want the density after the next migration is given by:

$$r^2 = y^2 + (x - c)^2,$$

and $\mu\lambda$ being the fertility-survival factor, we have for the density at c,

$$u_2 = \mu\lambda\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} u_1\,\phi_n(r^2)\,dx\,dy.$$

Now $\phi_n(r^2) = Q_t\,\omega_0 = Q_t\,\dfrac{1}{2\pi\sigma^2}\,e^{-r^2/2\sigma^2}$.

To mark that this $Q_t$ operates only on this part of the expression, write it
$Q_t'$ and suppose it to operate on $\sigma'$ written for $\sigma$. After the operations are complete
we can put $\sigma'$ again $= \sigma$. Let

$$\psi\!\left(\frac{x}{\sigma}\right) = \frac{1}{\sqrt{2\pi}}\int_{x/\sigma}^\infty e^{-\frac{1}{2}x^2}\,dx.$$

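Then if $\mu\lambda$ be constant (see p. 29):

$$u_2 = \frac{\mu\lambda N}{2\pi}\,Q_t Q_t'\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \psi\!\left(\frac{x}{\sigma}\right)\frac{1}{\sigma'^2}\,e^{-\frac{1}{2}\{y^2 + (x-c)^2\}/\sigma'^2}\,dx\,dy.$$

Completing the integration with regard to y we have:

$$u_2 = \frac{\mu\lambda N}{\sqrt{2\pi}}\,Q_t Q_t'\int_{-\infty}^{+\infty} \psi\!\left(\frac{x}{\sigma}\right)\frac{1}{\sigma'}\,e^{-\frac{1}{2}(x-c)^2/\sigma'^2}\,dx.$$

Differentiate with regard to c:

$$\frac{du_2}{dc} = -\frac{\mu\lambda N}{\sqrt{2\pi}}\,Q_t Q_t'\int_{-\infty}^{+\infty} \psi\!\left(\frac{x}{\sigma}\right)\frac{1}{\sigma'}\,\frac{d}{dx}\,e^{-\frac{1}{2}(x-c)^2/\sigma'^2}\,dx.$$

Integrate by parts, and notice that the part between limits vanishes at both
of them and we have:

$$\frac{du_2}{dc} = \frac{\mu\lambda N}{\sqrt{2\pi}}\,Q_t Q_t'\int_{-\infty}^{+\infty} \frac{d\psi(x/\sigma)}{dx}\,\frac{1}{\sigma'}\,e^{-\frac{1}{2}(x-c)^2/\sigma'^2}\,dx.$$

But

$$\frac{d\psi(x/\sigma)}{dx} = -\frac{1}{\sqrt{2\pi}}\,\frac{1}{\sigma}\,e^{-\frac{1}{2}(x/\sigma)^2},$$

hence:

$$\frac{du_2}{dc} = -\frac{\mu\lambda N}{2\pi}\,Q_t Q_t'\,\frac{1}{\sigma\sigma'}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{x^2}{\sigma^2} + \frac{(x-c)^2}{\sigma'^2}\right)}\,dx.$$

This is integrable and gives:

$$\frac{du_2}{dc} = -\frac{\mu\lambda N}{\sqrt{2\pi}}\,Q_t Q_t'\,\frac{1}{\sqrt{\sigma^2 + \sigma'^2}}\,e^{-\frac{1}{2}c^2/(\sigma^2 + \sigma'^2)}.$$

Integrate with regard to c, and remember that $u_2 = 0$ if $c = \infty$; thus:

$$u_2 = \mu\lambda N\,Q_t Q_t'\,\frac{1}{\sqrt{2\pi}}\int_{c/\sqrt{\sigma^2 + \sigma'^2}}^\infty e^{-\frac{1}{2}x^2}\,dx \qquad \text{(lii)}.$$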

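Comparing this with (li) we see that $u_2$ differs from $u_1$ by (a) the introduction of
the factors $Q_t'$ and $\mu\lambda$ and (b) the replacement of $\sigma$ in the lower limit by $\sqrt{\sigma^2 + \sigma'^2}$.
The process can therefore be repeated as often as we please, and we have for
the value:

$$u_m = (\mu\lambda)^{m-1} N\,Q_t Q_t' Q_t'' \ldots \text{to } m \text{ terms}\;\frac{1}{\sqrt{2\pi}}\int_{c/\Sigma}^\infty e^{-\frac{1}{2}x'^2}\,dx',$$

where $\Sigma^2 = \sigma^2 + \sigma'^2 + \sigma''^2 + \ldots$ to m terms.

After the operations indicated by the Q's are completed, we are to put
$\sigma' = \sigma'' = \ldots = \sigma$.

Now it is clear that a differentiation with regard to any $\sigma^2$ is precisely the
same as one with regard to $\Sigma^2$. We can therefore write for all the Q's the
simple expression

$$1 + \nu_4(\sigma^2)^2\frac{d^2}{d(\Sigma^2)^2} - \nu_6(\sigma^2)^3\frac{d^3}{d(\Sigma^2)^3} + \ldots + (-1)^s\nu_{2s}(\sigma^2)^s\frac{d^s}{d(\Sigma^2)^s} + \ldots,$$

understanding that $d/d(\Sigma^2)$ operates only on $\Sigma$ and that after the operation is
completed we can put $\Sigma = \sqrt{m}\,\sigma$. Thus the complete solution is:

$$u_m = N(\mu\lambda)^{m-1}\left(1 + \nu_4(\sigma^2)^2\frac{d^2}{d(\Sigma^2)^2} - \nu_6(\sigma^2)^3\frac{d^3}{d(\Sigma^2)^3} + \ldots + (-1)^s\nu_{2s}(\sigma^2)^s\frac{d^s}{d(\Sigma^2)^s} + \ldots\right)^{m}\frac{1}{\sqrt{2\pi}}\int_{c/\Sigma}^\infty e^{-\frac{1}{2}x^2}\,dx \qquad \text{(liii)}.$$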

This  is  true  for  c  positive  or  negative,  i.e.  whether  the  density  be  considered 
at  a  point  on  the  originally  occupied  or  originally  unoccupied  side  of  the  boundary. 

Up  to  terms  of  order         we  have  for  the  operator  the  value 

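$$1 + m\,(\nu_4 q^2 - \nu_6 q^3 + \nu_8 q^4 - \nu_{10} q^5 + \nu_{12} q^6)
 + \frac{m(m-1)}{2!}\,(\nu_4^2 q^4 - 2\nu_4\nu_6 q^5 + 2\nu_4\nu_8 q^6)
 + \frac{m(m-1)(m-2)}{3!}\,\nu_4^3 q^6,$$

where q stands for $\sigma^2\,d/d(\Sigma^2)$.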
Now exactly as on p. 26 we may show that:


where  :  (77^)  =  x^.s-d  ('7™)  +  ^^^^  .  X-...-.  iVm)  +  ^^"2!  2~^^  ^  '  ^  "  ^^'-^  ('^-) 

(.-l)(.-2)(.-3) 

^  g  (  23  X.O.J.  X2te-4)  Wm^  i-  •  •  •  , 

$\eta_m = c/(\sqrt{m}\,\sigma)$, and       is defined on p. 10, Equation (xviii).


Thus 


{mt/3  +  -^7>i(m-l)i//}  mz/,o  +  m(m-l)t/,t/, 

 ~ — mr~   V'AVm)  +  ~,  xl^siVm) 

mv,,  +  m{m-l)v,v,  +  l'm{m-l){m-2)v;  ,    .    ,\1  .... 
+   'I'loiVnOjy   (hv). 

We see that this expression converges much more rapidly than that for $\phi_n(r^2)$,
if m be at all large.

The result (liv) might have been reached in a different manner. We might
have supposed the $(\mu\lambda)^{m-1}N\alpha$ individuals to have started from any element $\alpha$
on the populated side of the boundary and taken mn flights without multiplying
to their final resting-place. The effect of this would be that $\sigma^2 = \tfrac{1}{2}mn\,l^2$ and that
in the values of the $\nu$'s we must write mn for n. But doing this gives us
precisely the coefficients of the $\psi$'s in (liv). Thus (liv) is deduced directly
from (xlix). The proof becomes then much shorter, but it is more artificial;
the fact that we may suppose all the unborn individuals to scatter from the
original centre is not so easily realised, and further it does not in the process
picture what takes place until the final arrangement after the mth breeding cycle
is attained. In the method I have adopted we see the exact process of each
breeding multiplication, its increase of the operating factor by an additional $Q_t$,
and its increase of the square of the standard deviation by an additional $\sigma^2$.
Lastly the final form (liv) enables us, without recalculating the $\nu$'s for each
breeding cycle, to see very easily the effect in the case of any n-flight species, of
taking any number of breeding cycles.

So long as we keep $\mu\lambda$ constant of course our result for m breeding cycles
with n flights will be the same as for a simple scattering for mn flights of a
larger number of individuals. If $\mu\lambda$ varies, however, we must adopt the method
indicated in the above proof, and work out each migration successively. The
same method must be adopted if a patch be rendered permanently sterile, because
in such a case $\mu$ is not constant for all parts of the integrated area, and we
cannot suppose the whole final population to scatter from the original centres.

If we neglect the $\psi_4, \psi_6, \ldots$ terms in (liv) we have the value which would
follow from the Rayleigh solution of the fundamental problem, and this can be
very readily expressed in geometrical terms. For we mark at once that $u_1$ and
$u_m$ are in type identical curves. Take $u_1$ and stretch it vertically in the uniform ratio
of $(\mu\lambda)^{m-1}$ to 1, and horizontally in the ratio of $\sqrt{m}$ to 1, and it becomes $u_m$.
In other words the broken line on Diagram VII. represents the approximate solution
in this case after m migrations provided we read $N(\mu\lambda)^{m-1}$ for N on the vertical
scale and $\Sigma = \sqrt{m}\,\sigma$ for $\sigma$ on the horizontal scale. The Table on p. 33 gives the
chief results.
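
Written out, the stretching rule just described amounts to the following statement of the Rayleigh part of the solution. It is a restatement under the text's own assumptions ($\mu\lambda$ constant, $\sigma^2 = \tfrac{1}{2}nl^2$), with $\psi$ the probability integral of (xlix):

```latex
\[
  u_m \;\approx\; N(\mu\lambda)^{m-1}\,
  \psi\!\left(\frac{c}{\sqrt{m}\,\sigma}\right),
  \qquad
  \psi(\eta) = \frac{1}{\sqrt{2\pi}}\int_{\eta}^{\infty} e^{-\frac{1}{2}x^{2}}\,dx ,
  \qquad
  \sqrt{m}\,\sigma = l\sqrt{\tfrac{1}{2}mn}.
\]
```

At the boundary (c = 0) this gives $\tfrac{1}{2}N(\mu\lambda)^{m-1}$ for every m, and the distance at which any fixed fraction of the boundary density is reached grows as $\sqrt{m}$.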

The unit of this table is the length l of "flight." It will be desirable to illustrate
its application. Any such application can be of course only a suggestion, and on
this account the above Table has been calculated to only a few places of decimals.
But such suggestions may not be without value. They will become more than
suggestions when our knowledge is greater of the migratory habits of different
species. At present only rough approximations can be made as to the values
of n and l, and these admittedly are of small weight.

Illustration I. In captivity I have noted that H. aspersa will live for five
years. For two years it does not usually lay eggs, and then it will generally,
but not invariably, reproduce twice in the year. This is of course subject to
claustral conditions, and while these seem in some cases unfavourable, in others
they may be advantageous both in matter of longevity and, owing to the constant
food supply, in number of broods. This snail, as far as my observation goes,
appears to return to the same shelter after seeking its food. Leaving such
"flitters" on one side, I think we might look upon thirty to forty yards as a
maximum "flight" for such a snail and regard seven or eight such flights between
its egg layings as on the average an exaggeration.

We might therefore take l = 40 yards, n = 8, and an average during life of one
brood a year as being quite possible approximations in the case of some snails.

This indicates that the progress across a boundary into unoccupied country
would be such that 1 per cent. of the density at the boundary and, therefore,
possibly ½ per cent. of the density in the fully-occupied country, would only be
reached at 2061 yards from the boundary after 100 migrations. In other words,
such a species would only progress a mile or two at most in a century. Such
progress would hardly be noted in any studies hitherto made of distribution; the
limits of a species a hundred years ago were certainly not closely defined to a mile
or two, even if they have been recently. Of course there are many other ways in
which a slow moving species can be transported than by its own "flights," and
further no special stress is laid on the above case, but a study of Table VI. shows
that the advance of a slow scattering species* may be comparatively small. The
inference can accordingly be made that the existing boundaries of the geographical
distribution of certain forms of animal and plant life which are not marked by
natural barriers, and which do not correspond to obviously changing environmental
conditions, need not after all be associated with subtle physical differences
which have escaped the observation of the naturalist. The species may be progressing
into an unoccupied area, but at a rate hardly observable in the time during
which accurate distribution observations are available. If this view be correct
we should expect such boundaries with no apparent environmental change in
the case of species for which we might reasonably predict a small n and l.
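
The figure of 2061 yards in Illustration I appears to be consistent with the Rayleigh approximation, as the following worked arithmetic suggests. It assumes $\sigma^2 = \tfrac{1}{2}nl^2$, a boundary density of one half the full density, and the standard normal deviate 2.576 for an upper tail of 0.005; the memoir's own route to the number (through its Table VI.) is not reproduced here.

```latex
\[
  \sqrt{m}\,\sigma = l\sqrt{\tfrac{1}{2}mn}
  = 40\sqrt{\tfrac{1}{2}\cdot 100\cdot 8}
  = 40 \times 20 = 800 \text{ yards},
\]
\[
  \psi(\eta) = 0.01 \times \tfrac{1}{2} = 0.005
  \;\Longrightarrow\; \eta \approx 2.576,
  \qquad
  c = \eta\,\sqrt{m}\,\sigma \approx 2.576 \times 800 \approx 2061 \text{ yards}.
\]
```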

Illustration II. I have endeavoured to apply the above theory to the immigration
of mosquitoes into a cleared area. We will suppose in the present
treatment that the area bounded by a straight line (some attempt to allow
for curvature of the boundary will be considered later) has been cleared but is
not kept sterile to the species. I shall speak of a district as rendered sterile
to a species when it is made impossible for it to breed there, and kept sterile
when the breeding possibilities are persistently destroyed. The distinction is an
important one, especially in the mosquito case. For in the latter case all mosquitoes
are immigrants, and in the former case we have not only immigrants,
but their produce.

Major  Ronald  Ross,  who  has  most  kindly  provided  me  with  information  as 
to  mosquito  habits,  makes  the  following  remarks  : 

(a) That the number of mosquitoes produced varies roughly (ceteris paribus)
as the extent of surface breeding.

(b) That the breeding area can be taken as consisting of numerous isolated
small pools or vessels of water scattered fairly uniformly over the country.

(c) That the feeding places (houses, stables, birds, etc.) may be taken as
scattered pretty uniformly between the breeding pools.

(d) That abundance or scarcity of food can scarcely influence the question
much. A single man or bird will yield enough food for many mosquitoes, and
if they starve it is not because the food is not there, but because they cannot
reach it. They are therefore not likely to be drawn in general by special abundance
of food in any special direction. Wind tends to make mosquitoes "sit
tight," rather than allow themselves to be scattered.

It would thus appear that on the average an "equi-swampous" condition of
the environment and random "flights" of the mosquito will not be very wide
of the truth. The difficulty is to form some estimate of n and l. On these
points again Major Ross came to my help, but naturally the statements he made
were with great reservation.

* Of course any more quickly moving species that depends on this for food would have the same
boundary, but in its case the boundary would be environmentally defined.

($\alpha$) From egg to egg (i.e. from laying of eggs, hatching, larval and pupal
stages, to laying of eggs again) takes roughly about a fortnight in hot countries
with most mosquitoes. In England, gnats may have only one generation or two
in a summer, but in the tropics they may go on breeding throughout the year.
In cool countries the egg to egg cycle may be prolonged to a month or two.
In certain very hot and dry countries, breeding may be checked entirely except
during the rainy season. I have accordingly taken 20, 10 and 5 breedings to
the year to represent roughly these conditions.

($\beta$) Major Ross distinguishes between "minor vicissitudes," which an insect
makes when it hovers round its victim or mate, and "major vicissitudes" which
it makes when it passes from feeding place to pool for egg laying. These correspond
to my "flitters" and "flights." He considers that they go back to
water every four or five days, so that a "major vicissitude" occurs every two
days or so. We might therefore take, excluding flitters, the average number
of flights to be six or seven. Of course this is the roughest approximation, but still
not an unreasonable estimate of what probably takes place in the mosquito's life.

($\gamma$) As to the magnitude of l we have less definite data. Mosquitoes of a
rare  kind  have  been  said  to  have  been  found  two  or  three  miles  from  their  breeding 
place.  Major  Ross  thinks  that  Anopheles  will  exceptionally,  when  no  houses 
are  near,  probably  travel  ^  mile  for  their  food,  or  perhaps  further,  but  he  supposes 
the  average  distance  scarcely  to  exceed  ^  mile,  and  it  may,  as  houses  and  suitable 
pools  often  abound  not  more  than  50  yards  apart,  be  not  greater,  perhaps,  than 
100  yards. 

I have accordingly taken 100 yards and 500 yards as likely values for l, and
considering  1  per  cent,  of  the  boundary  value  of  the  mosquitoes'  density  as  a 
limit  to  their  existence  and  5  per  cent,  as  objectionable,  we  have  the  following 
table  : 

Table VII. Distances from the Boundary of a cleared but not sterile area at
which 1 per cent. and 5 per cent. of the boundary density of Mosquitoes will
be found in the course of a Year.

                                     Supposed number of Breeding Cycles in Year
                                         5                10                20
                                   n = 6   n = 7    n = 6   n = 7    n = 6   n = 7
Density 1 per cent.    l = 100       998    1078     1411    1524     1995    2155
   "    "    "         l = 500      4990    5390     7055    7620     9975   10775
Density 5 per cent.    l = 100       759     820     1074    1160     1518    1640
   "    "    "         l = 500      3745    4100     5370    5800     7590    8200

The distances are all given in yards.
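
The l = 100 entries of Table VII. appear to be reproduced by the Rayleigh approximation alone, and the sketch below shows one way of recomputing them. It rests on assumptions not stated beside the table: that $\sigma^2 = \tfrac{1}{2}nl^2$, that the boundary density is half the full density, and that the $\omega$-function corrections were neglected. The function name `rayleigh_distance` and its arguments are hypothetical, introduced here only for illustration.

```python
import math
from statistics import NormalDist


def rayleigh_distance(l, n, m, fraction_of_boundary):
    """Distance from the boundary at which the Rayleigh-approximation density
    falls to the given fraction of the boundary density, after m migrations
    of n flights each of length l (result in the units of l)."""
    big_sigma = l * math.sqrt(m * n / 2.0)      # Sigma = sqrt(m)*sigma, with sigma^2 = n*l^2/2
    tail = 0.5 * fraction_of_boundary           # boundary density is half the full density
    eta = NormalDist().inv_cdf(1.0 - tail)      # abscissa whose upper normal tail equals `tail`
    return eta * big_sigma


if __name__ == "__main__":
    for fraction in (0.01, 0.05):
        row = [round(rayleigh_distance(100, n, m, fraction))
               for m in (5, 10, 20) for n in (6, 7)]
        print(f"{fraction:.2f}  l = 100  {row}")
```

Since the distance is proportional to l, the l = 500 case follows by multiplying by five. The same function with l = 40, n = 8, m = 100 and a fraction of 0.01 gives the snail figure of Illustration I.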

Thus  we  see  that  the  least  of  these  distances  for  1  per  cent,  is  greater  than 
half  a  mile,  or,  if  an  area  be  cleared  but  not  rendered  sterile,  we  might  expect 
within  a  year  the  mosquitoes  to  reappear  within  half  a  mile  of  the  boundary, 
and  to  reach  an  objectionable  frequency  even  at  this  distance  for  most  of  the 
cases  considered. 

As  far  then  as  these  rough  numbers  can  be  taken  to  indicate  the  state  of 
affairs,  it  is  needful  not  only  to  clear  an  area  but  to  maintain  it  sterile.  The 
clearance  radius  may  be  only  ^  mile  and  is  hardly  likely  to  exceed  a  mile,  and 
the  above  results  only  mark  the  progress  of  immigration  in  the  course  of  one 
year  after  the  clearance.  Further  the  results  would  be  accentuated  if  the 
boundary  were  curved  or  an  approximately  circular  clearance  made. 

It does not appear to me that any substantial difference would be made in the
main result by reducing n to 3 or 4, although some difference would occur if l
were reduced to 20 or 30 yards.

(10) Problem III. To determine the distribution after m n-flight migrations
starting with a centre of population $N\alpha$.

The  previous  two  problems  indicate  the  nature  of  the  general  solutions  to  which 
I  now  proceed.  I  shall  adopt  the  longer  process  of  proof  in  this  first  case  as  being 
the  more  suggestive. 

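By (xii) and (xlvi), calling the operator as before $Q_t$, we have for the distribution
at X, Y due to a centre at the origin:

$${}_1\phi_n(X, Y) = \frac{N\alpha}{2\pi}\,Q_t\left(\frac{1}{\sigma^2}\,e^{-\frac{1}{2}(X^2 + Y^2)/\sigma^2}\right) \qquad \text{(lv)}.$$

Hence the distribution at (h, k) after a second migration of n flights is

$${}_2\phi_n(h, k) = \mu\lambda\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} {}_1\phi_n(X, Y)\;\phi_n\{(h-X)^2 + (k-Y)^2\}\,dX\,dY.$$

Call the $Q_t$ in this $Q_{t_2}$ and write the $\sigma^2$ on which it operates $\sigma_2^2$; call the $Q_t$ in
${}_1\phi_n(X, Y)$, $Q_{t_1}$ and the $\sigma^2$ on which it operates $\sigma_1^2$, we have:

$${}_2\phi_n(h, k) = \frac{\mu\lambda N\alpha}{(2\pi)^2}\,Q_{t_1}Q_{t_2}\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{1}{\sigma_1^2\sigma_2^2}\,e^{-\frac{1}{2}(X^2+Y^2)/\sigma_1^2 \,-\, \frac{1}{2}\{(h-X)^2+(k-Y)^2\}/\sigma_2^2}\,dX\,dY.$$

The integrations can be performed and give us

$${}_2\phi_n(h, k) = \frac{\mu\lambda N\alpha}{2\pi}\,Q_{t_1}Q_{t_2}\,\frac{1}{\sigma_1^2 + \sigma_2^2}\,e^{-\frac{1}{2}(h^2 + k^2)/(\sigma_1^2 + \sigma_2^2)} \qquad \text{(lvi)}.$$

This only differs from ${}_1\phi_n(X, Y)$ by the introduction of $\sigma_1^2 + \sigma_2^2$ for $\sigma^2$ and of the
factor $\mu\lambda\,Q_{t_2}$.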
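
We can accordingly repeat the process as often as we like and we have:

$${}_m\phi_n(h, k) = \frac{(\mu\lambda)^{m-1} N\alpha}{2\pi}\,Q_{t_1}Q_{t_2}\ldots Q_{t_m}\,\frac{1}{\Sigma^2}\,e^{-\frac{1}{2}(h^2 + k^2)/\Sigma^2},$$

where $\Sigma^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \ldots + \sigma_m^2$,

and after the operations have been performed we are to put all the $\sigma$'s equal
to $\sigma$ or $\Sigma^2 = m\sigma^2$. But no operator $Q_t$ affects any $\sigma$ in any other operator, and
a differentiation with regard to any one $\sigma^2$ is the same as one with regard to $\Sigma^2$; the Q's are thus
operators identical in form and we may write

$$Q_{t_1}Q_{t_2}\ldots Q_{t_m} = \left\{1 + \nu_4(\sigma^2)^2\frac{d^2}{d(\Sigma^2)^2} - \nu_6(\sigma^2)^3\frac{d^3}{d(\Sigma^2)^3} + \ldots + (-1)^s\nu_{2s}(\sigma^2)^s\frac{d^s}{d(\Sigma^2)^s} + \ldots\right\}^{m}.$$

In this form of the operator we can now write $\sigma^2 = \Sigma^2/m$ and call
the expression $Q_t^m$.

Thus

$$Q_t^m = 1 + N_4(\Sigma^2)^2\frac{d^2}{d(\Sigma^2)^2} - N_6(\Sigma^2)^3\frac{d^3}{d(\Sigma^2)^3} + N_8(\Sigma^2)^4\frac{d^4}{d(\Sigma^2)^4} + \ldots + (-1)^s N_{2s}(\Sigma^2)^s\frac{d^s}{d(\Sigma^2)^s} + \ldots,$$

where

$$N_4 = \frac{\nu_4}{m}, \qquad N_6 = \frac{\nu_6}{m^2}, \qquad
N_8 = \frac{m\nu_8 + \tfrac{1}{2}m(m-1)\nu_4^2}{m^4}, \qquad
N_{10} = \frac{m\nu_{10} + m(m-1)\nu_4\nu_6}{m^5},$$

$$N_{12} = \frac{m\nu_{12} + m(m-1)\nu_4\nu_8 + \tfrac{1}{6}m(m-1)(m-2)\nu_4^3}{m^6} \qquad \text{(lix)}.$$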

These values of the N's rapidly converge and their values are given in Table III.
on p. 21 of this paper with those of the $\nu$'s for a few values of n and m. As
we have seen on p. 32, they are the $\nu$'s obtained by using values of nm for n.

We now have the general solution of distribution from a centre:

$$ {}_m\phi_n(h, k) = (\mu A)^{m-1} N_a\, (\Omega_0 + N_4\Omega_4 + N_6\Omega_6 + \dots + N_{2s}\Omega_{2s} + \dots) \tag{lx} $$

This is absolutely identical with (xii), except that the constants $\nu$ are replaced
by other constants N of known value, and in every $\omega$-function we are to replace
$\sigma^2$ by $m\sigma^2$ or $\Sigma^2$, that is to say a uniform stretch in the ratio of $\sqrt{m}$ to 1 is
given to any surface $z = \omega_{2s}$ parallel to the axes of x and y. This is denoted by
writing $\Omega_{2s}$ for $\omega_{2s}$.

If we confine our attention to the Rayleigh part of the solution, which will be
more and more nearly exact as m increases, for the N's rapidly decrease in value,
then we have

$$ {}_m\phi_n(h, k) = (\mu A)^{m-1} N_a\, \Omega_0 \tag{lxi} $$

and we see that every density gradient curve for the successive migrations is to
be obtained by a stretch from the first migration density curve.

In general, however, this result is not absolutely true because the different
components of the true solution are mixed in different proportions, the N's being
functions of m. We see, however, that the stretching rule becomes more and
more  accurate,  as  we  increase  either  the  number  of  flights  or  the  number  of 
migrations. 
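
As a rough numerical sketch of the stretching rule, the Rayleigh term of (lxi) can be evaluated directly. The function name and the argument `growth` (standing for the fertility-survival factor) are illustrative choices and not the memoir's, and the form assumed for the leading function is $\Omega_0 = e^{-\frac12 (h^2+k^2)/\Sigma^2}/(2\pi\Sigma^2)$ with $\Sigma^2 = m\sigma^2$.

```python
import math


def rayleigh_density_from_centre(h, k, m, sigma, n_a=1.0, growth=1.0):
    """Leading (Rayleigh) term of the density at (h, k) after m migrations.

    sigma  : one-migration scale, with sigma**2 = n * l**2 / 2
    n_a    : population of the centre
    growth : fertility-survival factor per cycle (hypothetical argument name)
    """
    big_sigma_sq = m * sigma ** 2  # stretching rule: sigma^2 -> m * sigma^2
    omega_0 = math.exp(-0.5 * (h * h + k * k) / big_sigma_sq) / (2.0 * math.pi * big_sigma_sq)
    return growth ** (m - 1) * n_a * omega_0


if __name__ == "__main__":
    sigma = math.sqrt(0.5 * 6 * 200.0 ** 2)  # n = 6 flights of l = 200 yards
    for m in (1, 4, 10):
        print(m, rayleigh_density_from_centre(0.0, 0.0, m, sigma))
```

With the growth factor set to 1, the density after m migrations at a point moved outwards in the ratio $\sqrt{m}$ to 1 is the single-migration density divided by m, which is the stretch described in the text.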

(11)  Problem  IV.  To  find  the  form  of  the  general  solution  for  the  distribution 
into  surrounding  space  after  m  migrations  of  any  population  initially  spread 
uniformly  over  any  given  patch  with  density  N. 

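The density at h, k, after m migrations due to a centre N dx dy, is by (lvii)
above

$$ = (\mu A)^{m-1}\, \frac{N\, dx\, dy}{2\pi}\, Q_t^m\!\left(\frac{1}{\Sigma^2}\, e^{-\frac12 \{(x-h)^2+(y-k)^2\}/\Sigma^2}\right). $$

To give the patch let x be integrated from $v_1$ to $v_2$, where $v_1$ and $v_2$ will
usually be functions of y, and then let y be integrated from $u_1$ to $u_2$.
We find:

$$ {}_mF_n(h, k) = (\mu A)^{m-1}\, \frac{N}{2\pi}\, Q_t^m \int_{u_1}^{u_2}\!\int_{v_1}^{v_2} \frac{1}{\Sigma^2}\, e^{-\frac12 \{(x-h)^2+(y-k)^2\}/\Sigma^2}\, dx\, dy \tag{lxii} $$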

This  is  the  general  form  of  the  solution  when  the  population  spreads  from  a 
uniform  patch  into  non-sterile  surrounding  country. 
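
For a patch whose limits are not simple, the Rayleigh part of (lxii) (the operator taken as unity) can be approximated by direct quadrature. The following is only a sketch under that assumption: the midpoint rule, the function name, and the arguments `steps`, `n` and `growth` (standing for $(\mu A)^{m-1}$) are illustrative choices, and `big_sigma` stands for $\Sigma = \sqrt{m}\,\sigma$.

```python
import math


def patch_density(h, k, big_sigma, u1, u2, v1, v2, steps=200, n=1.0, growth=1.0):
    """Midpoint-rule estimate of the Rayleigh part of (lxii).

    The patch is u1 <= y <= u2 and v1(y) <= x <= v2(y); v1 and v2 are callables.
    """
    dy = (u2 - u1) / steps
    total = 0.0
    for j in range(steps):
        y = u1 + (j + 0.5) * dy
        x_lo, x_hi = v1(y), v2(y)
        dx = (x_hi - x_lo) / steps
        for i in range(steps):
            x = x_lo + (i + 0.5) * dx
            r_sq = (x - h) ** 2 + (y - k) ** 2
            total += math.exp(-0.5 * r_sq / big_sigma ** 2) * dx * dy
    return growth * n * total / (2.0 * math.pi * big_sigma ** 2)


if __name__ == "__main__":
    sigma = math.sqrt(0.5 * 6 * 200.0 ** 2)  # n = 6 flights of l = 200 yards
    a = 880.0  # hypothetical circular patch of half a mile radius

    def left(y):
        return -math.sqrt(max(a * a - y * y, 0.0))

    def right(y):
        return math.sqrt(max(a * a - y * y, 0.0))

    # Density at the centre of the occupied circle after one migration.
    print(patch_density(0.0, 0.0, sigma, -a, a, left, right))
```

For a rectangle or a circle the quadrature can be compared with the closed forms obtained in the problems that follow, which gives some check on the step count chosen.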

If on the other hand we want the distribution after m migrations starting
with a cleared patch, which is not kept sterile, we have

$$ {}_mf_n(h, k) = (\mu A)^{m-1} N - {}_mF_n(h, k) \tag{lxiii} $$

for the whole district would have had a uniform density of $(\mu A)^{m-1} N$ had there
been no clearance. Hence

$$ {}_mf_n(h, k) = (\mu A)^{m-1} N\, Q_t^m\!\left(1 - \frac{1}{2\pi}\int_{u_1}^{u_2}\!\int_{v_1}^{v_2} \frac{1}{\Sigma^2}\, e^{-\frac12 \{(x-h)^2+(y-k)^2\}/\Sigma^2}\, dx\, dy\right) \tag{lxiv} $$

Hence the rule: If the solution can be found for a single migration, replace $\sigma^2$
by $m\sigma^2$, and each $\nu$ by the proper N, multiply by the factor $(\mu A)^{m-1}$, and the solution
for m migrations is deduced.

It  will  thus  be  clear  that,  if  the  solution  can  in  any  case  be  found  for  one 
migration  fully,  we  can  at  once  extend  it  to  the  case  of  any  number  of  migrations, 
with  constant  fertility-survival  factor. 

(12)  Problem  V.  To  determine  the  distribution  after  a  first  migration  into  a 
cleared  rectangular  area. 

Let the area be the rectangle 2a × 2b, and the origin be taken at its centre
and axes of x and y parallel respectively to the sides 2a and 2b. Then the density
at any point h, k, after a single migration, $f_n(h, k)$, is given by the principle of
the last problem by

$$ f_n(h, k) = N - F_n(h, k) \tag{lxv} $$

where $F_n(h, k)$ is the distribution from a uniformly occupied rectangular area
into surrounding unoccupied space.

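But

$$ F_n(h, k) = N \int_{-a}^{+a}\!\int_{-b}^{+b} \phi_n\{(x-h)^2 + (y-k)^2\}\, dx\, dy $$

$$ = \frac{1}{2\pi}\, N Q_t\, \frac{1}{\sigma^2} \int_{-a}^{+a}\!\int_{-b}^{+b} e^{-\frac12 \{(x-h)^2+(y-k)^2\}/\sigma^2}\, dx\, dy $$

$$ = N Q_t \left( \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-a}^{+a} e^{-\frac12 (x-h)^2/\sigma^2}\, dx \times \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-b}^{+b} e^{-\frac12 (y-k)^2/\sigma^2}\, dy \right). $$

Let $P_0(c)$ stand for the probability integral

$$ \frac{1}{\sqrt{2\pi}} \int_0^c e^{-\frac12 x^2}\, dx. $$

Then:

$$ \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-a}^{+a} e^{-\frac12 (x-h)^2/\sigma^2}\, dx = \frac{1}{\sqrt{2\pi}}\int_{-(a+h)/\sigma}^{(a-h)/\sigma} e^{-\frac12 x^2}\, dx = \frac{1}{\sqrt{2\pi}}\left(\int_0^{(a-h)/\sigma} + \int_0^{(a+h)/\sigma}\right) e^{-\frac12 x^2}\, dx. $$

Thus:

$$ F_n(h, k) = N Q_t \left\{P_0\!\left(\frac{a-h}{\sigma}\right) + P_0\!\left(\frac{a+h}{\sigma}\right)\right\}\left\{P_0\!\left(\frac{b-k}{\sigma}\right) + P_0\!\left(\frac{b+k}{\sigma}\right)\right\} \tag{lxvi} $$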


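Now consider the differentiation of $P_0(m/\sigma)$ with regard to $\sigma^2$:

$$ \frac{d}{d(\sigma^2)}\left\{P_0\!\left(\frac{m}{\sigma}\right)\right\} = \frac{1}{2\sigma}\frac{d}{d\sigma}\left(\frac{1}{\sqrt{2\pi}}\int_0^{m/\sigma} e^{-\frac12 x^2}\, dx\right) = -\frac12\, \frac{m}{\sigma^3}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 m^2/\sigma^2}. $$

Writing $\sigma^2 = t$ as before, we find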


Hence 


lj27r{-iyu(         ^,  1  ,  {s-l){s-2)  1.3 


2        r"2       \  2  1.2  2.2 

+  17273  f^-*)  2  .  2  .  2  +  ^^'7 ' 

'iH^]=\w^^^^^^   

the  expression  being  the  same  as  that  on  p.  32. 

Now let us write the following for brevity where $\eta = m/\sigma$:
11       _i  2 

11  _i  2 

=  o  T^'?^  (21^, +  3i'„i/., (17)  +  ...+5K,,,i/;,(,_2) (77) +  ...), 
2  N/27r 

A  (^)  =  ^  7^    " +  6^B^.  (^)  +  . . .  +  ^-y—       (.-3)  (^)  +  •  •  • )  , 

A  (^7)  =  ^  7^    " (^^«  +  1  ^^^0^.  (>?)  +  •••+  ^^^7^2^^^  (^)  +  •••)••  ■  (1^^^)' 

and  so  on.  All  these  functions  are  directly  expressible  in  &j-functions  as 
on  p.  27. 

Further  let      P.  (v^)  =  (  -  1)^^^      P.      =  \  ^  ^e-^'\.,s-v,  (^)   (Ixx). 

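Then we have, if

$$ \eta_1 = (a-h)/\sigma,\quad \eta_2 = (a+h)/\sigma,\quad \epsilon_1 = (b-k)/\sigma,\quad \epsilon_2 = (b+k)/\sigma, $$

$$ F_n(h, k) = N\big[\{P_0(\eta_1)+P_0(\eta_2)\}\{P_0(\epsilon_1)+P_0(\epsilon_2)\} + \{L_0(\eta_1)+L_0(\eta_2)\}\{P_0(\epsilon_1)+P_0(\epsilon_2)\} $$
$$ \quad + \{P_0(\eta_1)+P_0(\eta_2)\}\{L_0(\epsilon_1)+L_0(\epsilon_2)\} + \{P_1(\epsilon_1)+P_1(\epsilon_2)\}\{L_1(\eta_1)+L_1(\eta_2)\} $$
$$ \quad + \dots + \{P_s(\epsilon_1)+P_s(\epsilon_2)\}\{L_s(\eta_1)+L_s(\eta_2)\} + \dots\big] \tag{lxxi} $$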

The L-functions involve the rapidly converging $\nu$-coefficients, and the first few
terms will suffice to get an idea of the distribution. If we retain only the Rayleigh
terms we find:

$$ f_n(h, k) = N\big[1 - \{P_0(\eta_1)+P_0(\eta_2)\}\{P_0(\epsilon_1)+P_0(\epsilon_2)\}\big] \tag{lxxii} $$

which can be ascertained for given values of a, b, h, k and $\sigma$ from the ordinary
tables of the probability integral.
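
In place of printed tables, the Rayleigh formula (lxxii) can be evaluated with the error function. The sketch below assumes $P_0(c) = \frac12\,\mathrm{erf}(c/\sqrt2)$, which follows from the definition of the probability integral; the function names are illustrative, and the numerical values in the usage lines (six flights of 200 yards, a patch of 880 by 1760 yards half-sides) are taken as an example only.

```python
import math


def p0(c):
    """Probability integral P0(c) = (1/sqrt(2*pi)) * integral from 0 to c of exp(-x^2/2)."""
    return 0.5 * math.erf(c / math.sqrt(2.0))


def cleared_rectangle_density(h, k, a, b, sigma, n=1.0):
    """Rayleigh density at (h, k) after one migration into a cleared 2a x 2b rectangle."""
    eta1, eta2 = (a - h) / sigma, (a + h) / sigma
    eps1, eps2 = (b - k) / sigma, (b + k) / sigma
    return n * (1.0 - (p0(eta1) + p0(eta2)) * (p0(eps1) + p0(eps2)))


if __name__ == "__main__":
    sigma = math.sqrt(0.5 * 6 * 200.0 ** 2)  # sigma^2 = n * l^2 / 2
    print(cleared_rectangle_density(0.0, 0.0, 880.0, 1760.0, sigma))   # centre
    print(cleared_rectangle_density(880.0, 0.0, 880.0, 1760.0, sigma))  # middle of a long side
```

The density is returned as a fraction of N when `n` is left at 1, so the output reads directly as the proportion of the uncleared density.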

If we make b infinite, then $P_s(\epsilon_1)$ and $P_s(\epsilon_2) = 0$ for s > 0, and $L_s(\epsilon_1)$ and
$L_s(\epsilon_2) = 0$, $P_0(\epsilon_1) = P_0(\epsilon_2) = \frac12$, and

$$ f_n(h, k) = N\{1 - P_0(\eta_1) - P_0(\eta_2) - L_0(\eta_1) - L_0(\eta_2)\} \tag{lxxiii} $$

This  could  be  deduced  directly  from  (xlix)  and  it  represents  the  first  migration 
distribution  into  an  indefinitely  long  cleared  strip  or  belt.  This  is  a  result  of 
some  interest  as  it  might  approximately  apply  to  the  migration  into  a  zone  cleared 
by  a  flood  or  a  fire  of  certain  types  of  animal  or  vegetable  life. 

(13)  Problem  VI.  To  determine  the  distribution  after  m  migrations  into  a 
cleared  but  not  sterile  rectangular  area. 

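By the general proposition on p. 39 we have only to write $\Sigma = \sqrt{m}\,\sigma$ for $\sigma$,
and the N's for the $\nu$'s in the L's. Let us put

$$ \eta_1' = (a-h)/\Sigma = \eta_1/\sqrt{m},\quad \eta_2' = \eta_2/\sqrt{m},\quad \epsilon_1' = \epsilon_1/\sqrt{m},\quad \epsilon_2' = \epsilon_2/\sqrt{m}. $$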

Let 

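and so forth, then we have for the full solution:

$$ {}_mF_n(h, k) = (\mu A)^{m-1} N\big[\{P_0(\eta_1')+P_0(\eta_2')\}\{P_0(\epsilon_1')+P_0(\epsilon_2')\} $$
$$ \quad + \{L_0'(\eta_1')+L_0'(\eta_2')\}\{P_0(\epsilon_1')+P_0(\epsilon_2')\} + \{P_0(\eta_1')+P_0(\eta_2')\}\{L_0'(\epsilon_1')+L_0'(\epsilon_2')\} $$
$$ \quad + \{P_1(\epsilon_1')+P_1(\epsilon_2')\}\{L_1'(\eta_1')+L_1'(\eta_2')\} $$
$$ \quad + \dots + \{P_s(\epsilon_1')+P_s(\epsilon_2')\}\{L_s'(\eta_1')+L_s'(\eta_2')\} + \dots\big] \tag{lxxiv} $$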

The terms here will very rapidly converge for any fairly large value of m,
so that for many purposes we may write the solution:

$$ {}_mF_n(h, k) = (\mu A)^{m-1} N\, \{P_0(\eta_1')+P_0(\eta_2')\}\{P_0(\epsilon_1')+P_0(\epsilon_2')\} \tag{lxxv} $$

which can be found at once from the usual tables of the probability integral.
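
A short sketch of how (lxxv) combines with the complement rule (lxiii) to give the density inside a cleared but not sterile rectangle after m migrations. The function names and the argument `growth` (the fertility-survival factor per cycle) are illustrative, and $P_0$ is again taken as $\frac12\,\mathrm{erf}(c/\sqrt2)$; the figures in the usage lines are assumed example values.

```python
import math


def p0(c):
    """Probability integral from 0 to c of the standard normal curve."""
    return 0.5 * math.erf(c / math.sqrt(2.0))


def refilled_rectangle_density(h, k, a, b, sigma, m, n=1.0, growth=1.0):
    """Rayleigh density at (h, k), m migrations after clearing a 2a x 2b rectangle.

    The patch is not kept sterile. sigma is the one-migration scale; the
    m-migration scale is sqrt(m) * sigma, as in the stretching rule.
    """
    big_sigma = math.sqrt(m) * sigma
    from_occupied = (p0((a - h) / big_sigma) + p0((a + h) / big_sigma)) * (
        p0((b - k) / big_sigma) + p0((b + k) / big_sigma)
    )
    return growth ** (m - 1) * n * (1.0 - from_occupied)


if __name__ == "__main__":
    sigma = math.sqrt(0.5 * 6 * 200.0 ** 2)
    for m in (1, 2, 5, 10):
        # Fraction of the uncleared density at the centre of an 880 x 1760 yard patch.
        print(m, refilled_rectangle_density(0.0, 0.0, 880.0, 1760.0, sigma, m))
```

Printing the central value for increasing m shows how quickly a clearance that is not kept sterile fills up again on these assumptions.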

Illustration I. A rectangular patch 2 miles long and 1 mile broad is cleared of
mosquitoes, but not retained sterile. What would be the central density at the
end of the year? Suppose 10 breeding cycles with their scatter migrations, each
of 6 flights, to take place in the year. Then if we take 200 yards as a possible
round value for the flight we have:

m = 10, n = 6, l = 200 yds., $\sigma^2 = \frac12 n l^2 = 120{,}000$ or $\sigma = 346.41$ yds.
a = 880 yds., b = 1760 yds., $\eta_1 = \eta_2 = 2.540$, $\epsilon_1 = \epsilon_2 = 5.081$,
$\Sigma = \sqrt{10}\,\sigma = 1095.44$ yds., $\eta_1' = \eta_2' = 0.803$, $\epsilon_1' = \epsilon_2' = 1.607$.

Hence ${}_{10}F_6(0, 0) = (\mu A)^9\, 4 P_0(0.803)\, P_0(1.607)\, N$,

or, using Sheppard's Tables:

${}_{10}F_6(0, 0) = (\mu A)^9 \times 4 \times 0.2890 \times 0.4460\, N$
$= (\mu A)^9 \times 0.5156\, N$.

Thus ${}_{10}f_6(0, 0) = (\mu A)^9 (1 - 0.5156) N$.

We  see  accordingly  that  if  the  fertility  and  the  death-rate  were  the  same  in 
the  clearance  and  in  the  populated  district  outside,  the  density  at  the  centre 
of  the  cleared  patch  would  at  the  end  of  the  year  be  almost  50  per  cent,  of 
that  in  uncleared  country.  It  is  thus  obvious  that  clearance  can  be  of  small 
use,  unless  it  is  followed  by  permanent  preservation  of  sterility.  Even  if  one 
annual  clearance  were  made  it  is  very  unlikely — if  the  actual  values  of  the 
constants  are  at  all  near  those  assumed — that  the  mosquitoes  would  not  by  the 
9th  or  10th  breeding  cycle  within  the  year  before  the  annual  clearance  was 
repeated  have  reached  a  very  substantial  density  even  at  the  centre  of  the 
patch.  We  have  thus  an  additional  argument  in  favour  of  rendering  a  district 
not  only  sterile,  but  keeping  it  so.  In  such  a  case  since  and  v^,  xjj.,,  xjj^  are 
negative  we  shall  have  a  density  somewhat  less  than  : 

${}_1f_6(0, 0) = N\{1 - 4 P_0(2.540)\, P_0(5.081)\} = N(1 - 0.9889)$ about.

Thus: ${}_1f_6(0, 0) = 0.011N$ approximately.

It  follows  that  in  the  centre  of  such  a  rectangular  patch,  there  would  roughly 
be  only  about  1  mosquito  for  every  100  in  uncleared  country. 

But  while  this  shows  that  such  a  sterile  patch  would  be  a  great  improvement 
for  a  denizen  at  the  centre  it  is  well  to  enquire  what  happens  in  such  patches 
some  way  from  the  centre.    I  accordingly  add  the  following  illustration. 

Illustration II. A square area of one mile side is cleared and kept permanently
sterile. What will be the density at the centre and a quarter of a mile from
the centre on the same assumption as before?

Here a = b = 880 yds.

At the centre $\eta_1 = \eta_2 = \epsilon_1 = \epsilon_2 = 2.54$ and:

${}_1f_6(0, 0) = N[1 - 4\{P_0(2.54)\}^2] = N\{1 - (0.9889)^2\} = 0.022N$;
or, we find one mosquito for every fifty in uncleared country. Taking our
quarter of a mile directly towards one of the boundaries, we have h = 440,
k = 0, and:

$\eta_1 = 1.27$, $\eta_2 = 3.81$, $\epsilon_1 = \epsilon_2 = 2.54$.
Thus: ${}_1f_6(440, 0) = N[1 - \{P_0(1.27) + P_0(3.81)\}\{2P_0(2.54)\}]$
$= N\{1 - (0.3980 + 0.4999)(0.9889)\} = 0.112N$.

Thus at ¼ mile from the centre (or from the edge) of the clearance, the
density is 11 per cent. of that in uncleared country. It may be doubted whether
this is a sufficient reduction, and, supposing the above assumptions to be anything
like roughly correct, it may be needful to render more than a square
mile permanently sterile to protect a patch of one square half-mile.

On the other hand a cleared but not permanently sterile square mile would
after a year have a density at the same point, ¼ mile from the centre, of:

${}_{10}f_6(440, 0) = (\mu A)^9 N[1 - \{P_0(0.402) + P_0(1.205)\}\{2P_0(0.803)\}] = (\mu A)^9 \times 0.69N$,
or of 69 per cent. of that in uncleared country.

Another  point  seems  of  some  interest.  What  is  the  density  at  the  boundary 
after  the  first  migration  ? 

At the middle point of the edge it is

${}_1f_6(880, 0) = N[1 - \{P_0(0) + P_0(5.08)\}\{2P_0(2.54)\}]$
$= N(1 - 0.5000 \times 0.9889)$
$= 0.506N$.

This is almost the $\frac12 N$ of an indefinitely long straight boundary.

At the corner it is

${}_1f_6(880, 880) = N[1 - \{P_0(0) + P_0(5.08)\}^2] = 0.75N$ nearly,

or, as we should expect, has risen much beyond the $\frac12 N$ value.

There  is  no  difficulty  in  tracing  the  contour  lines  of  the  population  density 
in  this  case. 

If we consider a cycle of 10 breedings in a non-sterile patch we have:
${}_{10}f_6(880, 0) = (\mu A)^9 N[1 - \{P_0(0) + P_0(1.607)\}\{2P_0(0.803)\}]$
$= 0.742N(\mu A)^9$

and ${}_{10}f_6(880, 880) = (\mu A)^9 N[1 - \{P_0(0) + P_0(1.607)\}^2]$
$= 0.801N(\mu A)^9$.

Thus  if  the  patch  were  not  sterile,  the  effect  of  the  clearance  would  at  the 
boundary  after  the  lapse  of  a  year  be  marked  by  a  20  to  25  per  cent,  reduction. 
The  illustrations  I  have  given  are  of  course  dependent  on  the  values  of  the 
constants  selected.  Such  constants  have  at  present  been  little  studied,  and 
accordingly  small  weight  can  be  laid  on  the  actual  numerical  results.  But  the 
theory  appears  to  indicate  useful  lines  of  inquiry,  even  if  its  results  will  of 
course  need  to  be  controlled  everywhere  by  local  facts.  In  a  general  way  there 
can  be  little  doubt  that  a  theory  like  the  present  will  not  only  lead  to  a  more 
systematic  classification  of  local  facts  and  to  fuller  observation  of  the  habits  of 
local  species,  but  that  this  knowledge  itself  will  in  its  turn  test  the  applicability 
of  the  theory,  or  suggest  the  directions  in  which  it  may  need  modification. 

(14) Problem VII. To determine the distribution after a first migration into
a cleared circular area.

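Let the radius of the cleared area be a. Then at distance c from the centre,
inside or outside the circle of radius a, the distribution $f_n(c)$ is given by:

$$ f_n(c) = N\int_a^{\infty}\!\int_0^{2\pi} \phi_n(c^2 + r^2 - 2rc\cos\theta)\, r\, d\theta\, dr \tag{lxxvi} $$

$$ = \frac{N}{2\pi}\, Q_t \int_a^{\infty}\!\int_0^{2\pi} \frac{1}{\sigma^2}\, e^{-\frac12 (c^2 + r^2 - 2rc\cos\theta)/\sigma^2}\, r\, d\theta\, dr. $$

Now $\int_0^{2\pi} \cos^m\theta\, d\theta = 0$, if m be odd, and $= 4\int_0^{\pi/2} \cos^{2s}\theta\, d\theta$

$$ = 4\, \frac{(2s-1)(2s-3)\cdots 1}{2s\,(2s-2)\cdots 2}\, \frac{\pi}{2} = 2\pi\, \frac{(2s)!}{(2^s\, s!)^2}, $$

if m be even and = 2s.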
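
The value used for the integral of an even power of the cosine over a full turn, $2\pi\,(2s)!/(2^s s!)^2$, and the vanishing of the odd powers, can be checked numerically. The sketch below uses a plain midpoint rule; the function names and the step count are illustrative choices.

```python
import math


def cos_power_integral(m, steps=2000):
    """Midpoint-rule value of the integral of cos(theta)**m over 0..2*pi."""
    d_theta = 2.0 * math.pi / steps
    return sum(math.cos((i + 0.5) * d_theta) ** m for i in range(steps)) * d_theta


def even_power_closed_form(s):
    """2*pi * (2s)! / (2**s * s!)**2, the stated value for m = 2s."""
    return 2.0 * math.pi * math.factorial(2 * s) / (2 ** s * math.factorial(s)) ** 2


if __name__ == "__main__":
    for s in range(5):
        print(2 * s, cos_power_integral(2 * s), even_power_closed_form(s))
```

Because the integrand is periodic, the midpoint rule is very accurate here and the two columns agree closely for small s.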


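Hence:

$$ f_n(c) = N Q_t\, \frac{1}{\sigma^2}\, e^{-\frac12 c^2/\sigma^2} \int_a^{\infty} e^{-\frac12 r^2/\sigma^2}\, r \sum_{s=0}^{\infty} \left(\frac{c}{\sigma}\right)^{2s} \left(\frac{r}{\sigma}\right)^{2s} \frac{1}{(2^s\, s!)^2}\, dr \tag{lxxvii} $$

Let

$$ \int_{a/\sigma}^{\infty} x^{2s+1}\, e^{-\frac12 x^2}\, dx = M_{2s+1}(a/\sigma) \tag{lxxviii} $$

$M_{2s+1}(a/\sigma)$ is thus the (2s+1)th moment of the 'tail' of a normal or Gaussian
curve of errors (multiplied by $\sqrt{2\pi}$) about its axis. Its values have been tabled
for s = 1, 2, 3 and 4.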

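Thus we have:

$$ f_n(c) = N Q_t\, e^{-\frac12 c^2/\sigma^2} \sum_{s=0}^{\infty} \left(\frac{c}{\sigma}\right)^{2s} \frac{M_{2s+1}(a/\sigma)}{(2^s\, s!)^2} \tag{lxxix} $$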

But  it  is  easy  to  see  that : 
Accordingly  : 

(0) .1 5 .  ^-i,    ... .  J,,  (^^]  ...  (w, 

The successive differentiations of this expression with regard to $t = \sigma^2$, involved
in the operator $Q_t$, which are needful if we wish to give the corrections to
the Rayleigh solution, are straightforward but extremely laborious. We can
throw the solution into other forms.

Write: $\epsilon_1 = \frac12 c^2/\sigma^2$, $\epsilon_2 = \frac12 a^2/\sigma^2$,

then we have:

$$ f_n(c) = N Q_t\, e^{-(\epsilon_1+\epsilon_2)} \sum_{s=0}^{\infty} \frac{\epsilon_1^s}{s!}\left(1 + \epsilon_2 + \frac{\epsilon_2^2}{2!} + \dots + \frac{\epsilon_2^s}{s!}\right) $$

$$ = N Q_t\, e^{-\epsilon_1} \sum_{s=0}^{\infty} \frac{\epsilon_1^s}{(s!)^2} \int_{\epsilon_2}^{\infty} x^s e^{-x}\, dx \tag{lxxxi} $$

Here $\int_{\epsilon_2}^{\infty} x^s e^{-x}\, dx$ is the incomplete $\Gamma$-function for an integer value of s. This
can be found fairly easily from the above series, or may be determined from tables
of the incomplete $\Gamma$-function which it is hoped may be shortly published.
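
The series form of (lxxxi) is easy to sum by machine. The following sketch gives only the Rayleigh part (the operator $Q_t$ taken as unity) as a fraction of N; the function name and the truncation argument `terms` are illustrative choices, and the value 0.3227 in the usage lines is an assumed example.

```python
import math


def cleared_circle_density(eps1, eps2, terms=60):
    """Rayleigh part of (lxxxi), as a fraction of N.

    eps1 = c**2 / (2 * sigma**2) for the point at distance c from the centre,
    eps2 = a**2 / (2 * sigma**2) for the radius a of the clearance.
    """
    total = 0.0
    inner = 0.0  # running value of 1 + eps2 + ... + eps2**s / s!
    for s in range(terms):
        inner += eps2 ** s / math.factorial(s)
        total += eps1 ** s / math.factorial(s) * inner
    return math.exp(-(eps1 + eps2)) * total


if __name__ == "__main__":
    eps2 = 0.3227  # example value only
    for eps1 in (0.0, 0.0807, 0.3227):
        print(eps1, cleared_circle_density(eps1, eps2))
```

At the centre (eps1 = 0) the sum reduces to $e^{-\epsilon_2}$, and with no clearance (eps2 = 0) it returns 1, which are two simple checks on the truncation.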

Again: $J_0(2i\sqrt{z}) = \sum_{s=0}^{\infty} \dfrac{z^s}{(s!)^2}$,

hence we have:

$$ f_n(c) = N Q_t\, e^{-\epsilon_1} \int_{\epsilon_2}^{\infty} J_0(2i\sqrt{\epsilon_1 x})\, e^{-x}\, dx \tag{lxxxii} $$

a very concise form, which does not, however, simplify the calculations. Integrate
by parts and we have:

0  cte, 

or : 


0  I        (weie,)*  . 
=  NQ,e- S  iJ^)  V,  {2iJ7J,)   (Ixxxiii). 


This is the solution in Bessel's functions, and inside the cleared area, where
$\epsilon_2$ is greater than $\epsilon_1$, would give fairly good results if tables of the higher
Bessel's functions for imaginary values of the argument were available.

We can also express the solution in terms of $\omega$-functions as follows:


T-rr  .  r  /    X        f"^  /I  ^■'\'  -hr^la^  ^dv 


Then  f^(c)^NQ,S^^I.{a)E,(c). 

Now  E,  (r)  =  -i,  c  -  i'''''  (\  -T  =  2ir<r'S6, 


ip  ^  ip  > 


s !  \2  0-7  0 

the  6's  being  undetermined  constants,  for  dividing  by  the  exponential  factor  we 
have  an  integer  algebraic  expression  in  r'jcr-  on  both  sides.  Multiply  both  sides 
by  Xip'^^'^'         integrate  between  0  and  00 ,  p  being  =  or  <  s.    Then  : 

1   f"^  -I'Aia'i      /I  r' 


'oo  _i  2/  2      /I  r^\*  r °° 


27rcr-
h 1


Thus 


E,  (r)  =  2770-'  |a)„  -  S(o,  +  -j2\y   ^  (3!}^   + 


.  (Ixxxiv) 


=  27r<T''U,{r)*,  say. 


Now  consider 


's        r  o:> 


0). 


,rdr 


d 


d  {a-y  L  27r 


1  -kr-^la'^ 


d' 


d  {a^y 


=  o).^  —  S(a„s_o_  (Ixxxv). 

We  can  now  express  1^  (r)  in  terms  of  w-functions. 

We  have  : 

rdr 


/,(r)=  s\E,{r) 


cr 


=  S\  2770-' 


sis-l)  s(s-l)(s-2) 


J  ^ 


Thus 
where : 


,^     o/      ,N  f  1)       s(s— — 2) 

=  .!277cr-(.+  l)|.o-^  +  -^co.-   ^  . 

:s!  277o-'(s+l)F,(r). 

K  (c)  =  NQAn^cT^S  {{s  +l)U,  (c)  V,  (a))  (Ixxxvi), 

0 

rr  /  \  •5(^—1)  — 2) 

„  ,  -  s  s(6'  — 1)  s(s—l)(s  —  2) 
n(^)  =  ^o-jT2!^.  +  -2!3r^^  3!4!  

a result which allows of fairly rapid determination from tables of $\sigma^2\omega_{2s}$.

There is, perhaps, less difficulty in this form in allowing for the first term
or two of the operator $Q_t$, for $U_s(r)$ and $V_s(r)$ can be at once differentiated with
regard to $\sigma^2$, but even then the final result has considerable complexity.

* This result involves the expression of any power of $r^2/\sigma^2$ in $\omega$-functions.

The Rayleigh solution value is easily found by putting $Q_t = 1$ in any of the
forms of (lxxix), (lxxx), (lxxxi), (lxxxiii) or (lxxxvi).

A  case  of  peculiar  interest  arises  when  c  =  0,  or  we  take  the  density  at  the 
centre  of  the  clearance.    In  this  instance  we  have  : 

Now 

and  e~^'''''''=2Tr(T'o>„ 

therefore  j^.(e-*">')  =  2..'  +  .  jg- .} 

=  2:7<r'{(-l)'c.„  +  «(-l)-X„-„} 
=  e-4"'"'(-l).{x„-«X.,.-.,}- 

Thus 

$$ f_n(0) = N e^{-\epsilon_2}\{1 - 2\nu_4\chi_2 + (\nu_4 - 3\nu_6)\chi_4 + (\nu_6 - 4\nu_8)\chi_6 + (\nu_8 - 5\nu_{10})\chi_8 + \dots\} $$

$$ = 2\pi\sigma^2 N\{\omega_0 - 2\nu_4\omega_2 + (\nu_4 - 3\nu_6)\omega_4 + (\nu_6 - 4\nu_8)\omega_6 + (\nu_8 - 5\nu_{10})\omega_8 + \dots\} \tag{lxxxvii} $$

We are also able to consider the secondary problem:

What is the distribution into unoccupied space surrounding a uniformly
occupied circular area due to a first migration?

Let the radius of the area be a and let the density at any distance c be
$F_n(c)$ after the first migration. Then clearly, if all space were uniformly filled,
we should have uniformity after the first migration, or:

$$ F_n(c) + f_n(c) = N, $$

hence: $F_n(c) = N - f_n(c)$ (lxxxviii).

The solution is thus thrown back on the solution obtained for the previous
problem. In particular at the centre of the populated area we have:

$$ F_n(0) = N - f_n(0) \tag{lxxxix} $$

We  are  thus  able  to  calculate  the  reduced  central  density  due  to  a  migration 
from  the  area  to  the  surrounding  unoccupied  district,  i.e.  the  effect  on  population 
of  the  spread  outwards  of  a  colony. 

(15)  Problem  VIII.  Indirect  solution  of  the  General  Problem  of  the 
Random  Walk. 

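It may not be without interest to put on record the distribution density
after n flights in the case of a cleared circular area, if it be expressed in
Kluyver's manner by the integral of a Bessel's function product.

$$ F_n(c) = N\int_0^{a}\!\int_0^{2\pi} \phi_n(c^2 + r^2 - 2rc\cos\theta)\, r\, d\theta\, dr, $$

and

$$ f_n(c) = N - F_n(c) $$

$$ = N\left[1 - \frac{1}{2\pi}\int_0^{a}\!\int_0^{2\pi}\!\int_0^{\infty} u\, J_0\!\left(u\sqrt{c^2 + r^2 - 2cr\cos\theta}\right)\{J_0(ul)\}^n\, du\, r\, d\theta\, dr\right], $$
by (iii),

$$ = N\left[1 - \int_0^{a}\!\int_0^{\infty} u\, J_0(ur)\, J_0(uc)\, \{J_0(ul)\}^n\, du\, r\, dr\right], $$
by Neumann's Theorem (see p. 6),

$$ = N\left[1 - \int_0^{\infty} \frac{\{J_0(ul)\}^n}{u}\left\{\int_0^{a} J_0(ur)\, ur\, d(ur)\right\} J_0(uc)\, du\right] $$

$$ = N\left[1 - \int_0^{\infty} \frac{\{J_0(ul)\}^n}{u}\Big[J_1(ur)\, ur\Big]_0^{a}\, J_0(uc)\, du\right], $$
by the theorem cited on p. 7,

$$ = N\left[1 - \int_0^{\infty} \{J_0(ul)\}^n\, a J_1(ua)\, J_0(uc)\, du\right]. $$

Or, writing v = au, we have

$$ f_n(c) = N\left[1 - \int_0^{\infty} J_1(v)\, J_0\!\left(\frac{vc}{a}\right)\left\{J_0\!\left(\frac{vl}{a}\right)\right\}^n dv\right] \tag{xc} $$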


This expression is concise. The integral expresses the probability that if an
individual start from the origin and take (n+1) flights, the first of magnitude
c and the remaining n of magnitude l, at random, he will find himself within
a distance a of his starting point. But there does not seem any convenient
method of evaluating the integral. Comparing with (lxxxiii) we have the curious
identity:


.(xci). 


Write c = l, a = r and n − 1 for n, then


where  is  the  operator, 

d^ 
dn 


dmf
or, by (iv), $P_n(r)$, the chance that an individual taking n flights from a centre
should be found within a distance r from that centre, is:

PAr)  =  N{l  -(>....-<--'*-'"'!|(,i)V.(2i  ^-^^)}   (xcii). 

Since  'f'^'^'^'^^^^dr^^''^'^^^' 

we  have  here  the  complete  analytical  solution  in  known  functions — i.e.  the 
Bessel's  functions  with  imaginary  arguments — of  my  original  problem  of  the 
random  walk.  But  this  formal  solution  provides  no  better  method  for  shortly 
determining  the  dispersal  curves  than  that  already  indicated  in  these  pages. 
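
The chance $P_n(r)$ can also be estimated by direct simulation of the walk, which gives an independent check on any analytical form. The sketch below is a plain Monte Carlo estimate and is not a method used in the memoir; the function names, trial count and seed are illustrative. It is compared with the Rayleigh approximation $1 - e^{-r^2/(n l^2)}$, which follows from $\sigma^2 = \frac12 n l^2$.

```python
import math
import random


def fraction_within(r, n, l=1.0, trials=20000, seed=1):
    """Monte Carlo estimate of the chance of lying within r of the start after n flights."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = y = 0.0
        for _ in range(n):
            theta = rng.uniform(0.0, 2.0 * math.pi)  # each flight in a random direction
            x += l * math.cos(theta)
            y += l * math.sin(theta)
        if r >= math.hypot(x, y):
            hits += 1
    return hits / trials


def rayleigh_fraction(r, n, l=1.0):
    """Rayleigh approximation to the same chance, with sigma^2 = n * l^2 / 2."""
    return 1.0 - math.exp(-r * r / (n * l * l))


if __name__ == "__main__":
    n = 6
    for r in (1.0, 2.0, 3.0, 4.0):
        print(r, fraction_within(r, n), rayleigh_fraction(r, n))
```

For six flights the simulated and Rayleigh values are close but not identical; the difference is the part the correction terms are meant to supply.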

(16) Problem IX. To find the distribution after m migrations each of n
flights, there being originally a circular clearance which is not kept sterile.

The solution is found by writing $m\sigma^2$ for $\sigma^2$, putting the N's for the $\nu$'s
in $Q_t$, which becomes $Q_t^m$, and multiplying by the factor $(\mu A)^{m-1}$ assumed to
be constant. This can be done to any of the forms (lxxix) to (lxxxiii), or (lxxxvi).
If we write:

$\epsilon_1' = \epsilon_1/m$ and $\epsilon_2' = \epsilon_2/m$


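we find

$$ {}_mf_n(c) = N(\mu A)^{m-1}\, Q_t^m\, e^{-\epsilon_1'} \sum_{s=0}^{\infty} \frac{{\epsilon_1'}^{s}}{(s!)^2} \int_{\epsilon_2'}^{\infty} x^s e^{-x}\, dx \tag{xciii} $$

or:

$$ {}_mf_n(c) = N(\mu A)^{m-1}\, Q_t^m\, e^{-\epsilon_1'} \int_{\epsilon_2'}^{\infty} J_0(2i\sqrt{\epsilon_1' x})\, e^{-x}\, dx \tag{xciv} $$

Or again: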


,Jr.{c)=N{|.^y^-'Qre-^'^'■'^^S^^^         e/.(2^•Ae,)   (xcv), 

{c)  =  N{i.^)-^Qr^■^'nt<r^l  (      (^)  n  (^)  (^•+1))  (^cvi). 

Of  these,  I  have  found  the  first  quite  as  convenient  as  any  other  to  obtain 
numerical  results  from.    I  shall  now  illustrate  the  circular  patch  formulae. 

Illustration I. A circular patch ½ mile radius is cleared of mosquitoes but
not kept sterile. To find the density at the centre, at ¼ mile from the centre,
and at the margin after ten breeding cycles.

We shall suppose as before l = 200 yards, n = 6, and therefore

$\sigma^2 = 120{,}000$ square yards, $\epsilon_2 = \dfrac{1}{2}\dfrac{a^2}{m\sigma^2} = 0.3227$.

The  second  term  in  Qf"^  will  be  of  the  order  ^  of  the  first  and  I  shall 
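neglect it. Accordingly the solution may be taken

$$ {}_{10}f_6(c) = e^{-(\epsilon_1+\epsilon_2)}\, (\mu A)^9 N\left(1 + \epsilon_1(1 + \epsilon_2) + \frac{\epsilon_1^2}{2!}\left(1 + \epsilon_2 + \frac{\epsilon_2^2}{2!}\right) + \dots\right) $$

The successive bracketed terms in $\epsilon_2$ are

1.3227, 1.3748, 1.3804, 1.3809 and 1.3809,
which is equal to $e^{\epsilon_2}$ to our number of decimal places. Hence we may put

$$ {}_{10}f_6(c) = (\mu A)^9 N e^{-(\epsilon_1+\epsilon_2)}\left\{1 + \epsilon_1(e^{\epsilon_2} - 0.0582) + \frac{\epsilon_1^2}{2!}(e^{\epsilon_2} - 0.0061) + \frac{\epsilon_1^3}{3!}(e^{\epsilon_2} - 0.0005) + \sum_{s=4}^{\infty} \frac{\epsilon_1^s}{s!}\, e^{\epsilon_2}\right\} $$

$$ = (\mu A)^9 N e^{-(\epsilon_1+\epsilon_2)}\left(1 - e^{\epsilon_2} + e^{\epsilon_1+\epsilon_2} - 0.0582\epsilon_1 - 0.0030\epsilon_1^2 - 0.0001\epsilon_1^3\right) $$

$$ = (\mu A)^9 N\left\{1 - e^{-\epsilon_1} + e^{-(\epsilon_1+\epsilon_2)}\left(1 - 0.0582\epsilon_1 - 0.0030\epsilon_1^2 - 0.0001\epsilon_1^3\right)\right\}. $$

${}_{10}f_6(0) = (\mu A)^9 N\{e^{-\epsilon_2}(1)\} = (\mu A)^9 N \times 0.724$.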

We can test the accuracy of this result by using Equation (lxxxvii) which,
if we put the N's for the $\nu$'s, gives:

${}_{10}f_6(0) = (\mu A)^9 N e^{-\epsilon_2}(1 - 2N_4\chi_2 + \dots)$,
and, since $\chi_2 = 1 - \epsilon_2$,

$= (\mu A)^9 N \times 0.730$.

The  agreement  is  accordingly  good  enough  for  practical  purposes,  and  we 
may  say  that  within  a  year  the  mosquitoes  would  at  the  centre  of  the  patch 
have  a  density  73  per  cent,  of  what  they  would  have  in  uncleared  country. 

I now consider the density at a quarter of a mile from the centre, $\epsilon_1 = 0.0807$,
and using the above formula we find:

${}_{10}f_6(440) = (\mu A)^9 N(1 - e^{-0.0807} + e^{-0.4034} \times 0.9953)$

$= (\mu A)^9 N \times 0.75$,

or, we see that at a quarter mile, midway between centre and boundary of
the patch, the density is only 2 per cent. more than at the centre.

Finally, at the boundary itself, $\epsilon_1 = 0.3227 = \epsilon_2$,

${}_{10}f_6(880) = (\mu A)^9 N(1 - e^{-0.3227} + e^{-0.6454} \times 0.9809)$

$= (\mu A)^9 N \times 0.79$.

Thus  the  cleared  patch  would  within  the  year  have  filled  up  with  a  population 
of  mosquitoes  varying  in  density  from  73  per  cent,  at  the  centre  to  about  80  per 
cent,  at  the  boundary,  or  the  clearance  without  permanent  sterility  would  have 
been  quite  ineffectual  with  the  assumed  values  of  the  constants. 

Illustration II. Let us assume precisely the same conditions as in the previous
illustration, except that the area shall be supposed sterile, and we will consider
what happens at the end of the first migration.

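At the centre we have by Equation (lxxxvii):

${}_1f_6(0) = N e^{-\epsilon_2}\{1 - 2\nu_4\chi_2 + (\nu_4 - 3\nu_6)\chi_4 + (\nu_6 - 4\nu_8)\chi_6 + \dots\}$.

But
$-2\nu_4 = 0.083333$, $\chi_2(\epsilon_2) = 1 - \epsilon_2 = -2.227000$,
$\nu_4 - 3\nu_6 = -0.032407$, $\chi_4(\epsilon_2) = 2 - 4\epsilon_2 + \epsilon_2^2 = -0.494471$,
$\nu_6 - 4\nu_8 = -0.005498$, $\chi_6(\epsilon_2) = 6 - 18\epsilon_2 + 9\epsilon_2^2 - \epsilon_2^3 = 8.031303$,
$\nu_8 - 5\nu_{10} = 0.000082$, $\chi_8(\epsilon_2) = 24 - 96\epsilon_2 + 72\epsilon_2^2 - 16\epsilon_2^3 + \epsilon_2^4 = 34.752347$,
$\epsilon_2 = 3.227$.

Hence: ${}_1f_6(0) = N e^{-3.227}(1 - 0.185583 + 0.016024 - 0.044156 + 0.002850)$

$= 0.031N$.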

This three per cent. of the density in uncleared area might possibly prove
a trouble and on our assumptions it may be doubted whether the half-mile
radius is sufficient. If we take the first term only, we find 0.040N, or four
per cent., not an important practical difference.
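
The arithmetic of this central value is easily repeated by machine. In the sketch below the four coefficient combinations are simply the values quoted in the illustration; the function names are illustrative, and the closed form used for the polynomials, $\chi_{2s}(\epsilon) = s!\sum_j (-1)^j \binom{s}{j}\epsilon^j/j!$, is an assumption that reproduces the four polynomials written out above.

```python
import math


def chi(s, eps):
    """chi_{2s}(eps), assumed equal to s! times the Laguerre polynomial of degree s."""
    return math.factorial(s) * sum(
        (-1) ** j * math.comb(s, j) * eps ** j / math.factorial(j) for j in range(s + 1)
    )


def central_density(eps2, coefficients):
    """N-fraction at the centre: exp(-eps2) * (1 + sum over s of c_s * chi_{2s}(eps2)).

    coefficients[0] multiplies chi_2, coefficients[1] multiplies chi_4, and so on.
    """
    correction = sum(c * chi(s, eps2) for s, c in enumerate(coefficients, start=1))
    return math.exp(-eps2) * (1.0 + correction)


if __name__ == "__main__":
    quoted = [0.083333, -0.032407, -0.005498, 0.000082]  # values quoted in the illustration
    print(central_density(3.227, quoted))  # with the correction terms
    print(central_density(3.227, []))      # first (Rayleigh) term only
```

Passing an empty coefficient list gives the first-term value, so the size of the correction can be read off by comparing the two printed numbers.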

The introduction of even the first modifying term when c is not zero appears
to lead to such complexity that I content myself with calculating the approximate
value given by the Rayleigh solution for distances of ¼ and ½ mile from the
centre of the clearance. In this case $\epsilon_2 = 3.227$, $\epsilon_1 = 0.807$ and 3.227 respectively
half-way to and at the boundary. I proceed just as before and deduce the
following approximate value for ${}_1f_6(c)$, i.e.

$$ {}_1f_6(c) = N\{1 - e^{-\epsilon_1} + e^{-(\epsilon_1+\epsilon_2)}(1 - 20.9769\epsilon_1 - 7.8851\epsilon_1^2 - 0.2355\epsilon_1^3 $$
$$ \quad - 0.0228\epsilon_1^4 - 0.0016\epsilon_1^5 - 0.0001\epsilon_1^6)\}. $$

Hence

${}_1f_6(440) = 0.179N$, corresponding to $\epsilon_1 = 0.807$

and

${}_1f_6(880) = 0.709N$, corresponding to $\epsilon_1 = 3.227$.

Thus the density at ¼ of a mile from the centre of the cleared patch would
be some 18 per cent. of the density in uncleared country. In other words on
our assumptions a clearance of one mile diameter, if kept sterile, would hardly
suffice to keep an area of ½ mile diameter free of mosquitoes.

Compared  with  a  straight  boundary,  where  the  density  falls  to  about  one 
half  that  of  uncleared  country  at  the  boundary,  we  see  that  the  bending  of 
the  boundary  has  a  most  marked  effect  in  its  neighbourhood,  the  curvature 
raising  the  boundary  density  from  about  50  to  71  per  cent,  of  the  uncleared 
density.  In  fact  the  density  is  almost  equal  to  the  75  per  cent,  in  the  boundary 
angle  of  a  square  clearance. 


A  MATHEMATICAL  THEORY  OF  RANDOM  MIGRATION  53 


The differences between a square and a circular patch inscribed in it are
noteworthy, indicating the marked influence of the area at the angles. Thus
at the centre we have only 2 per cent. as against 3 per cent., and at ¼ mile
from the centre 11 per cent. as against 18 per cent.

As  far  as  the  above  numerical  investigations  are  to  be  looked  upon  as  anything 
but  illustrations  of  the  nature  of  the  calculations  requisite  to  apply  the  theory 
of  random  migration  to  the  mosquito  clearance  problem,  they  must  be  taken  : 

(i)  As  merely  an  incentive  to  further  study  of  the  manner  in  which  mosquitoes 
scatter  from  the  breeding  ponds.  It  would  seem  possible,  if  difficult,  to  experimentally 
test  this  by  in  some  way  marking  a  large  number  of  insects,  and  determining  the 
nature  and  extent  of  the  flight. 

(ii)  As  indicating  that  permanent  sterility  of  the  protection  belt  is  almost 
certainly  needful.  The  ^  to  3  per  cent,  of  mosquitoes  at  the  centre  of  the  clearance 
amounting  to  6  to  18  per  cent,  at  ^  mile  distance  may  or  may  not  be  serious, 
but  they  certainly  would  very  soon  be  if  they  were  able  to  breed. 

(iii)  As  showing  that  on  the  rough  numbers  taken,  that  a  clearance  belt 
of  probably  ^  mile  round  a  settlement  would  be  the  minimum  desirable  sterile  zone. 
But  it  is  quite  possible  that,  when  the  requisite  constants  are  better  known,  it 
will  be  found  that  smaller  belts  will  suffice.  It  is  possibly  rather  an  exaggerated 
view  to  suppose  a  mosquito  to  make  six  random  flights  of  200  yards  between 
breeding  spot  and  breeding  spot.  But  certainly  many  insects  I  have  noted  will 
fly  with  great  rapidity  in  one  flight  50,  100  or  200  yards,  and  these  flights  are 
quite  distinct  from  "  flitters." 

(17)  Conclusions.  The  present  memoir  suffers  of  course  from  all  the  defects  which 
must  accompany  a  first  attempt  to  develop  a  mathematical  theory  of  phenomena 
which  have  hitherto  not  been  studied  with  this  development  in  view.  The  theory 
itself  suggests  hypotheses  and  constants  which  have  never  yet  been  considered. 
How  far  with  a  broad  average  of  environment  in  relation  to  food  supply,  breeding 
places,  shelter,  foes,  etc.  is  the  spread  of  a  species  random  ?  Are  any  of  the 
geographical  limits  to  plant  or  insect  or  animal  life  non-environmental  and  in 
course  of  change  ?  If  so,  statistical  studies  of  the  density  gradients  of  such  species 
for  a  few  miles  either  side  of  the  supposed  boundary  would  form  most  interesting 
work  for  biometricians.  But,  apart  from  this  observational  work,  a  good  deal  of 
experimental  inquiry  might  be  usefully  attempted  with  regard  to  the  constants 
of  random  scatter  or  flight  in  the  cases  of  both  seeds  and  insects. 

On  the  theoretical  side  there  are  many  problems  left  untouched.  The  present 
memoir  has  only  opened  up  the  outskirts  of  a  very  big  field.  It  would  be  of 
value to investigate the number of terms in the expansion in ω-functions requisite
to  practically  reproduce  the  graphically  constructed  density  distributions  for 
migrations  of  3,  4  or  5  flights.    Our  expansion  to  6  terms  is  hardly  close  enough 
for practical work until n = 6 or 7. Many other shapes of populated or of cleared
areas  would  provide  problems  of  some  interest,  especially  when  the  spread  of  the 
colony  was  limited  in  one  or  more  directions  by  environmental  barriers,  such  as  sea, 
river  or  mountain  range.  The  problem  of  sterile  areas  has  been  by  no  means 
exhausted, for in such cases I have only dealt with a result of the first
migration,  but  actually  there  will  be  a  second  and  later  migrations  in  which 
not  only  new  immigrants  will  appear  but  a  portion  of  the  first  immigrants  will 
be  emigrants  and  again  able  to  breed  when  they  reach  uncleared  country.  Our 
solution  thus  gives  only  a  minimum  limit  to  the  percentages  if  the  immigrants 
do  not  die  at  the  end  of  the  first  breeding  cycle.  Much  interest  attaches 
also  to  cases  in  which  the  fertility  and  the  death-rate  are  correlated  with  the 
density,  i.e.  fxA  is  not  to  be  considered  a  constant.  But  in  these  as  in  other 
problems  which  suggest  themselves,  a  further  preliminary  knowledge  of  some  of 
the  ecological  constants  suggested  by  the  present  enquiry  would  be  an  extremely 
valuable  guide  to  the  direction  that  research  should  take. 

On  the  purely  mathematical  side  the  problem  of  the  "  random  walk "  may 
now  be  considered  as  fairly  completely  solved.  The  distribution  curves  have  been 
determined  until  they  pass  into  an  analytical  solution  expressed  by  a  new  type 
of  function.  The  expansion  in  these  functions  shows  the  limits  to  the  accuracy 
of  Lord  Rayleigh's  solution  of  a  certain  allied  problem  in  the  theory  of  sound. 
But the ω-functions which have arisen in the enquiry have most interesting
properties,  and  have  led  me  to  a  whole  series  of  allied  functions  of  one  and 
two  variables  which  I  propose  to  discuss  on  another  occasion.  The  expansion 
in ω-functions will I venture to think be found ultimately to have considerable
importance  for  mathematical  physics,  especially  in  the  evaluation  of  certain 
definite  integrals  which  arise  there.  The  possibility  of  practically  carrying  out  such 
expansions  depends  on  the  determination  of  the  successive  moments  (and  products) 
of  the  original  function,  a  process  with  which  every  statistician  is  now  fairly 
familiar.  But  applied  to  definite  mathematical  functions  it  loses  the  disadvantage 
with  which  it  is  burdened  in  statistical  practice — the  high  relative  probable 
error of very high moments, and becomes closely allied to the process of determining
the integral of the product of any function and a Legendre's coefficient
(or solid harmonic). Should the generalised ω-functions prove, as I anticipate,
of  some  mathematical  interest,  it  will  be  another  illustration  of  how  the  need 
of  the  applied  mathematician  has  thrust  him,  almost  unawares,  into  the  path 
of  a  novel  functional  development. 
XVI. ON FURTHER METHODS OF DETERMINING CORRELATION.

By  Karl  Pearson,  F.R.S. 

(1)  Introductory.  The  object  of  the  present  paper  is  to  give  an  account  of  some 
new  methods  of  determining  correlation.  It  is  not  suggested  that  they  can  with 
advantage replace the old processes, even when the distribution is approximately
normal; to my mind the methods of determining the correlation ratio and the
correlation coefficient (η and r) based on moments and product moments stand foremost
for the information they give and its weighable accuracy. At the same time
there are series which are so short, or cases in which it is desirable to come rapidly
to an approximate result, or data which cannot be presented in a form suitable for
product-moment working, where other methods are not only reasonable, but necessary.
To such cases the present new methods apply. I have termed them new methods
and  I  think  this  is  legitimate.  In  the  case  of  the  first  method,  I  have  not  seen  any 
hint  of  it  before.  In  the  case  of  what  I  term  grade  methods,  Dr  Spearman  has 
suggested  that  rank  in  a  series  should  be  the  character  correlated,  but  he  has  not 
taken  this  rank  correlation  as  merely  the  stepping  stone  by  which  to  reach  the  true 
correlation  of  the  variables  as  dependent  magnitudes,  and  further  in  the  discussion 
he  has  given  of  the  subject  he  has,  I  believe,  given  erroneous  formulae  and  made 
quite  incorrect  statements  as  to  the  magnitude  of  probable  errors. 

One  word  must  be  said  as  to  the  use  made  of  the  normal  distribution.  I  have 
used  it  here  as  on  many  other  occasions  as  a  means  of  suggesting  fitting  relations 
and  simple  formulae  for  correlation  constants.  This  does  not  necessarily  mean  (i)  that 
the  constants  reached  may  not  have  a  perfectly  definite  meaning  apart  from  normal 
distributions,  or  (ii)  that  the  formulae  obtained  may  not  hold  for  all  forms  of 
distribution  apart  from  normality.  As  an  illustration  of  the  first  case  I  cite  my 
mean  square  coefficient  of  contingency*.  This  is  a  perfectly  general  measure  of  the 
deviation  from  independent  probability  in  the  case  of  an  nxm  fold  table,  but  its 

* On the Theory of Contingency. "Drapers' Research Memoirs, Biometric Series I" (Dulau & Co.,
Soho Square, London).
actual form was selected so that it would agree with the coefficient of correlation in
the case of indefinitely fine grouping and normal distribution. As an illustration
of my second point I take the formulae given by me for the influence of selection on
variation and correlation*. These formulae were originally proved for normal
distributions, but for a number of years past the proofs given in my lectures have
been  perfectly  general,  depending  only  on  a  more  comprehensive  definition  of  what 
we  are  to  understand  as  correlation  in  the  case  of  a  complex  of  variables. 
These  points  will  be  considered  in  the  present  treatment  of  correlation. 

(2)    On  Difference  Methods  of  finding  the  Coefficient  of  Correlation. 

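Let x and y be two correlated variables, each measured from their means $m_1$ and
$m_2$ respectively. Then if $v = x - y$, and $\sigma_x$, $\sigma_y$, $\sigma_v$ denote the three standard deviations,

$$\sigma_v^2 = \sigma_x^2 + \sigma_y^2 - 2r_{xy}\sigma_x\sigma_y,$$

and

$$r_{xy} = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_v^2}{2\sigma_x\sigma_y} \qquad \text{(i)}.$$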

This method of finding $r_{xy}$ has long been in use as an alternative method to the
product-moment method†.
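
Formula (i) can be checked numerically. The following is a minimal sketch in Python; the function name and the toy data are illustrative assumptions and are not taken from the memoir.

```python
import math


def correlation_from_differences(xs, ys):
    """Formula (i): r = (sx^2 + sy^2 - sv^2) / (2 sx sy), where v = x - y."""
    n = len(xs)

    def variance(values):
        mean = sum(values) / n
        return sum((v - mean) ** 2 for v in values) / n

    var_x = variance(xs)
    var_y = variance(ys)
    var_v = variance([x - y for x, y in zip(xs, ys)])
    return (var_x + var_y - var_v) / (2 * math.sqrt(var_x * var_y))


# Hypothetical paired measurements, chosen only to exercise the formula.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
r_diff = correlation_from_differences(xs, ys)
print(r_diff)
```

For these toy pairs the product-moment coefficient is 0.8, and the difference formula should return the same value, since (i) is an algebraic identity.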

It  involves  finding  the  mean-square  difference  of  the  values  of  the  pairs  of 
correlated characters. It is possible, however, to find $r_{xy}$ from about one half
these  differences,  if  we  assume  the  distribution  to  be  normal. 

More  generally  I  proceed  as  follows.  Suppose  the  function  mx  —  ny  formed,  where 
m  and  n  are  at  present  indeterminate  positive  constants,  and  let  the  positive  values 
only  of  this  expression  be  taken  and  divided  by  the  total  frequency  N.  Then  it  will 
be  possible  to  determine  r  from  this  result. 

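If z be the ordinate of the surface, then:

$$z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right\} \qquad \text{(ii)},$$

and we have the above result expressed analytically:

$$\frac{S(mx-ny)}{N} = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\iint (mx-ny)\,\exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right\}dy\,dx \qquad \text{(iii)}.$$

The limits to y in order that mx − ny may be positive are y = −∞ to mx/n, and
the limits of x will then be x = −∞ to +∞.

Put $y' = y/\sigma_2$, $x' = x/\sigma_1$,

then

$$\frac{S(mx-ny)}{N} = \frac{1}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{+\infty}\int_{-\infty}^{m\sigma_1x'/(n\sigma_2)} (m\sigma_1x' - n\sigma_2y')\,\exp\left\{-\frac{x'^2 - 2rx'y' + y'^2}{2(1-r^2)}\right\}dy'\,dx'.$$
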

*  Phil.  Trans.  A,  Vol.  200,  pp.  1—66. 

† For example, Phil. Trans. A, Vol. 198, p. 242, and often elsewhere.
Write $t = \dfrac{m\sigma_1}{n\sigma_2}x' - y'$,

and  we  have : 


The  order  of  integration  can  now  be  changed  and  we  have  : 


where  if  e  =  mo-J{na.), 

1^  _  l-2re{l-e)-r'e  _  1  -  2re  + 

a' ~  {I  -  2r€  +  €')  (1  -r')'  l-r'  ' 

y  =  e  (1  -r)/{l-2r€  +  e). 
But  the  integral  with  regard  to  x'  is  -J 2tt  /8,  and 

S(mx  —  ny)      1     no-,  ,^ — 
Hence:  iy       =  2^  TTT? -^^'^  ^° ' 

Or,  for  the  positive  summation 

>S  (ma;  —  ny)     n V/  ( 1  —  r^)  Vw^  o-./  —  2rmn(r^  cr^  +  cti^ 


,(iv). 


A'  j2Tr        —  2rmcr,  (w,o-o  —  wcTj)  —  r^'rrfcTi 

This  general  value  does  not  appear  to  be  likely  to  be  of  much  service.  If  we  take 
m  =  n  =  l,  we  obtain  the  result  of  simply  summing  the  positive  differences  of  paired 
variates.    It  is  : 

S{x-y)_  a-./  (1  —  r^)\/cr/-2rcrjO-2  +  crj'  ,  ^ 

(v)  leads  to  an  equation  of  the  5th  order  to  find  r  and  again  does  not  appear  to 
be  likely  to  be  of  any  service.  The  variates  must  be  reduced  to  a  common  unit  before 
they  are  handled  if  we  are  to  make  (iv)  workable.  Such  a  unit  is  the  standard 
deviation. 

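If we write $m = 1/\sigma_1$, $n = 1/\sigma_2$, we have at once:

$$\frac{1}{N}\,S\left(\frac{x}{\sigma_1} - \frac{y}{\sigma_2}\right) = \sqrt{\frac{1-r}{\pi}}.$$

Thus we find:

$$r = 1 - \pi\left\{\frac{1}{N}\,S\left(\frac{x}{\sigma_1} - \frac{y}{\sigma_2}\right)\right\}^2 \qquad \text{(vi)}.$$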
(vi) is an extremely neat formula and might be taken as the definition of a quantity
measuring correlation. But the actual determination of correlation in this way,
i.e. the reduction of each variate to a deviation from its mean measured in terms
of its S.D. as unit, would probably be as troublesome as using the product-moment
method. 

One  special  case  occurs,  however,  in  which  the  above  formula  may  possibly  be  of 
good service. Suppose the two variates have the same mean = m and the same
S.D. = σ, then:

$$r = 1 - \pi\,\frac{\{S(\overline{m+x} - \overline{m+y})\}^2}{N^2\sigma^2} \qquad \text{(vii)}.$$

Or,  the  coefficient  of  correlation  is  the  result  of  subtracting  from  unity  tt  times  the 
square  of  the  mean  sum  of  the  positive  differences  of  paired  variates  divided  by  their 
common  standard  deviation. 

For  cases  in  which  both  variates  are  the  same,  brothers,  cousins  of  the  same  sex, 
homotypes,  etc.,  and  especially  for  some  cases  of  short  series,  the  method  may  be 
of  value. 

Illustration I. Resemblance of Length of Little Finger in Male Cousins. I take
a short series of 68 male pairs of cousins. The average value of the length measured
on the little finger was 51.02 mm. and its standard deviation 2.721 mm. There were
33 positive differences of finger length giving S(x − y) = 87.6 mm. Hence we had:

$$r = 1 - \pi\left(\frac{87.6}{68 \times 2.721}\right)^2 = .296.$$

Found by the product-moment method the answer was .287; the difference is
well within the probable error of the latter value. The process of taking differences
and  summing  was  considerably  shorter  than  finding  a  product  moment. 
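
The arithmetic of Illustration I can be repeated in a few lines. This is only a sketch: the function name is an assumption, and the standard deviation is read as 2.721 mm, which is the reading of the printed figure that brings the result close to the product-moment value quoted above.

```python
import math


def r_from_positive_differences(sum_pos_diff, n_pairs, sigma):
    """Difference formula for variates with a common mean and S.D.:
    r = 1 - pi * (S / (N * sigma))**2, S being the sum of positive differences."""
    return 1.0 - math.pi * (sum_pos_diff / (n_pairs * sigma)) ** 2


# Figures of Illustration I: S(x - y) = 87.6 mm, N = 68 pairs, sigma read as 2.721 mm.
r_cousins = r_from_positive_differences(87.6, 68, 2.721)
print(round(r_cousins, 3))
```

The result is close to .296, to be compared with the product-moment value .287.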

Illustration  II.  Assortative  Mating  in  the  case  of  Paramecium.  I  take 
Dr  Pearl's  Table  AA  3  from  Vol.  v.  p.  295  of  Biometrika  for  the  lengths  of 
conjugating  Paramecia. 

I  choose  this  purposely  because  there  was  no  difficulty  above  about  the  male 
cousins ;  there  were  only  two  equalities,  the  actual  measurements  of  each  individual 
being  recorded.  But  in  an  ordinary  correlation  table  owing  to  the  method  of  grouping 
there  will  be  a  very  considerable  number  of  ties,  and  the  problem  arises  how  are  they 
to  be  distributed.  Clearly  one  half  of  them  will  be  excesses  and  one  half  defects, 
if  we  suppose  the  odds  against  an  actual  tie  in  measuring  to  any  degree  of  accuracy 
to  be  very  great.  Hence  we  may  say  that  half  the  diagonal  total  is  to  be  treated  as 
in  excess.  But  at  what  portion  of  the  base  unit  are  we  to  set  the  pair  apart  ?  If  the 
frequency  was  uniformly  distributed  over  the  diagonal  cells,  we  should  take  the  average 
interval  between  a  pair  to  be  \  the  base  unit.  But  the  material  is  almost  always 
clustered  inside  the  cell,  and  clearly  ^  is  too  much.  The  actual  value  to  be  taken  would 
depend  upon  the  value  of  the  correlation  and  the  size  of  the  base  unit.    In  fact  we 
can only take a rough approximation. I suggest that 1/3 will be found to work
fairly well. Accordingly we take 1/6 of the contents of the diagonal cells, multiplied
by the base unit. The whole process may now be written as follows:


[Working scheme: the frequencies of each column of the table, read down to the diagonal cell, entered under the differences 0, 1, 2, ... (in base units); each scheme column is totalled and multiplied by its difference.]

$$\sigma = 19.112^{*}, \qquad S(x-y) = 279.3 \times 10,$$

$$r = 1 - \pi\left(\frac{2793}{400 \times 19.112}\right)^2 = .581.$$


The value obtained by the product-moment method is .588 ± .022.
The  correlation  Table  is  as  follows  : 

                              Length of First Conjugant.

Second
Conjugant 160-9 170-9 180-9 190-9 200-9 210-9 220-9 230-9 240-9 250-9 260-9 270-9 280-9  Totals
160-9         .     1     1     .     .     1     .     .     1     .     .     .     .       4
170-9         1     2     1     1     .     .     .     .     .     .     .     .     .       5
180-9         1     1     4     3     4     1     1     .     .     .     .     .     .      15
190-9         .     1     3     4    14     7     5     1     1     .     .     .     .      36
200-9         .     .     4    14    30    25     9     5     1     1     .     .     .      89
210-9         1     .     1     7    25    22    16     5     5     .     .     .     .      82
220-9         .     .     1     5     9    16    10    16     7     1     2     .     1      68
230-9         .     .     .     1     5     5    16    16     4     2     .     .     .      49
240-9         1     .     .     1     1     5     7     4     4     5     3     .     .      31
250-9         .     .     .     .     1     .     1     2     5     4     .     .     .      13
260-9         .     .     .     .     .     .     2     .     3     .     2     .     .       7
270-9         .     .     .     .     .     .     .     .     .     .     .     .     .       0
280-9         .     .     .     .     .     .     1     .     .     .     .     .     .       1
Totals        4     5    15    36    89    82    68    49    31    13     7     0     1     400

* Pearl, loc. cit. p. 226, Table II.
We proceed thus: Read each column down to and including the diagonal cell, and
place the total under the corresponding differences in the previous scheme. For
example, take the sixth column; 1, 0, 1, 7, 25, 22 are the corresponding frequencies,
and these numbers will be found, sloping from the column marked 5, i.e. difference
5 × 10, diagonally across the scheme. In this manner the columns of the table can be
disposed in the scheme at once. The scheme columns are then added up and
multiplied by the difference at the top, and, if multiplied again by the base unit,
in this case 10, the total gives S(x − y). The whole can be done with very great
rapidity, and the correlation found in about 10 minutes if σ be known.
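
The column-reading procedure just described can be sketched in Python. Two points are assumptions rather than statements of the memoir: the placement of the blank cells in `TABLE` is a reconstruction that agrees with the printed row and column totals, and the weight of one-sixth of a base unit for each diagonal pair is inferred from the printed sum 279.3. The function name is hypothetical.

```python
import math

# Frequencies of the correlation table; rows and columns are the classes
# 160-9, 170-9, ..., 280-9. Blank cells are entered as 0 (reconstructed layout).
TABLE = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [1, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 4, 3, 4, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 3, 4, 14, 7, 5, 1, 1, 0, 0, 0, 0],
    [0, 0, 4, 14, 30, 25, 9, 5, 1, 1, 0, 0, 0],
    [1, 0, 1, 7, 25, 22, 16, 5, 5, 0, 0, 0, 0],
    [0, 0, 1, 5, 9, 16, 10, 16, 7, 1, 2, 0, 1],
    [0, 0, 0, 1, 5, 5, 16, 16, 4, 2, 0, 0, 0],
    [1, 0, 0, 1, 1, 5, 7, 4, 4, 5, 3, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 2, 5, 4, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 2, 0, 3, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
]


def positive_difference_sum(table, base_unit, tie_fraction):
    """Read each column down to and including the diagonal cell.

    Off-diagonal cells count with their difference in class units; diagonal
    cells (ties) count with the assumed fraction of a unit."""
    total = 0.0
    for col in range(len(table)):
        for row in range(col + 1):
            weight = tie_fraction if row == col else (col - row)
            total += table[row][col] * weight
    return total * base_unit


N = sum(sum(row) for row in TABLE)
S = positive_difference_sum(TABLE, base_unit=10, tie_fraction=1 / 6)
r = 1 - math.pi * (S / (N * 19.112)) ** 2
print(N, round(S, 1), round(r, 3))
```

With these inputs the sum comes out near 2793 and the coefficient near .581, in agreement with the figures printed above.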
As  other  comparisons  I  give  the  homotypic  results  : 

                                          Difference method    Product method
Monmouthshire Ashes (65,000)                    .432             .405 ± .011
Papaver Rhoeas (Quantocks) (19,790)             .523             .533 ± .013
Ditto (Chilterns' Base) (25,160)                .395             .400 ± .012

These  results  show  that  there  exists  quite  a  reasonable  amount  of  agreement  between 
the  two  methods,  and  the  difference  method  is  much  the  shorter  when  the  table 
contains  thousands  of  observations  as  in  these  cases.  At  the  same  time  too  much 
reliance  must  not  be  placed  upon  the  difference  method,  not  only  because  it  assumes 
normality  of  distribution  but  because  it  involves  a  somewhat  rough  method  of 
approximation  in  the  case  of  the  diagonal  cell. 

One  further  point  may  be  noted.  Suppose  that  rank  in  a  series  was  a  true 
character  which  could  be  dealt  with  by  a  difference  formula  like  the  above  then  r  the 
correlation  of  the  ranks  would  be  given  by 

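$$r = 1 - \pi\,\frac{\{S(x-y)\}^2}{N^2\sigma^2}.$$

Now for such ranks $\sigma^2 = \tfrac{1}{12}(N^2-1)$, therefore

$$r = 1 - \frac{12\pi\{S(x-y)\}^2}{N^2(N^2-1)} \qquad \text{(viii)}.$$

Dr Spearman has introduced a quantity R which he terms a "correlational
coefficient*" and which he defines without any special justification by:

$$R = 1 - \frac{6\,S(x-y)}{N^2-1} \qquad \text{(ix)}.$$

We should thus have

$$1 - r = \frac{\pi}{3}\,(1-R)^2\left(1 - \frac{1}{N^2}\right) \qquad \text{(x)},$$

which would give approximately: $r = 2R - R^2$.
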

*  Journal  of  Psychology,  Vol.  ii.  p.  96. 
This is, of course, not true, for the distribution of ranks is not normal; the exact
formula  will  be  given  later ;  but  it  suffices  to  indicate  that  the  actual  distribution 
assumed  for  x  and  y  will  much  influence  the  relation  between  r  and  R.  Dr  Spearman 
from  trial  gives  the  empirical  formula 

$$r = \sin\left(\frac{\pi}{2}R\right) \qquad \text{(xi)},$$

which  is  also  incorrect.  But  the  above  relation  shows  that  we  are  not  a  priori 
compelled  to  suppose  that  r  merely  changes  its  sign,  not  its  numerical  value 
when  R  changes  sign. 

(3)  On  the  Correlation  of  Grades.  A  method  of  representing  frequency  has 
been  introduced  by  Francis  Galton  in  which  the  extent  of  variation  of  a  character  is 
expressed  by  the  position  of  the  individual  bearing  this  character  in  the  population. 
This  method  was  originally  spoken  of  as  that  of  percentiles  but  more  recently  as  that 
of  grades.  A  fundamental  feature  of  the  method  is  that  the  grade  is  looked  upon  as 
an  index  to  the  variate,  it  is  not  considered  as  in  itself  significant,  or  treated  as  an 
independent  character  of  the  individual.  In  order,  however,  to  pass  from  the  grade 
to  the  variate  it  is  absolutely  necessary  to  make  some  hypothesis  as  to  the  nature  of 
the  distribution.  The  hypothesis  hitherto  made  is  that  the  frequency  follows,  at 
least  fairly  closely,  the  normal  or  Gaussian  law.  On  this  assumption,  tables  of  the 
probability  integral  enable  us  to  pass  at  once  from  the  grade  to  the  magnitude  of  the 
variate,  and  vice  versa.  Quite  recently,  however,  Dr  Spearman  has  proposed  that 
rank  in  a  population  for  any  variate  should  be  considered  as  in  itself  the  quantitative 
measure  of  the  character,  and  he  proceeds  to  correlate  ranks  as  if  they  were  quanti- 
tative measures  of  character,  without  any  reference  to  the  true  value  of  the  variate. 
This seems to me a retrograde step; hitherto we have dealt with grade or rank
(I  will  distinguish  between  them  presently)  as  an  index  to  the  variate,  and  to  make 
rank  into  a  unit  itself  cannot  fail,  I  believe,  to  lead  to  grave  misconception.  Between 
mediocrities  the  unit  of  rank  treated  as  a  measure  of  a  variate  is  practically  zero, 
between  extreme  individuals,  it  is  very  large  indeed.  To  state  that  two  individuals 
differ  by  m  ranks  carries  no  meaning  at  all  unless  we  add,  (i)  the  size  of  the  population 
dealt  with,  (ii)  the  position  in  the  population  of  one  or  both  individuals,  and  (iii)  the 
nature  of  the  frequency  distribution  which  governs  the  population.  I  cannot  therefore 
look  upon  the  correlation  of  ranks  as  conveying  any  real  idea  of  the  correlation  of 
variates,  unless  we  have  a  means  of  passing  from  the  correlation  of  ranks  to  the  value 
of  the  correlation  of  the  variates,  i.e.  the  correlation  of  ranks  can  only  be  treated  as  a 
step  subsidiary  to  determining  the  true  variate  correlation. 

The  correlation  between  variates  can  be  made  to  change  widely  by  preserving 
the  same  system  of  ranks,  but  by  altering  the  nature  of  the  frequency  distribution. 
Thus  consider  the  system  : 
Variates   x:   −2     −1     +1     +2
           y:   −2     −1     +1     +2
Ranks      x:    1      2      3      4
           y:    1      2      3      4

The  correlation  of  variates  is  perfect  and  the  correlation  of  ranks  is  also  perfect.  But 
we  may  also  have  : 


Variates   x:   −2     −1.9   +1.9   +2
           y:   −2     −.01   +.01   +2
Ranks      x:    1      2      3      4
           y:    1      2      3      4

The correlation of variates is now .72, but the correlation of ranks remains perfect
and  would  indicate  nothing  of  this  great  difference.  I  think  that  it  is  safe  to  assert 
that  until  some  assumption  is  made,  at  least  as  to  the  approximate  nature  of  the 
distribution,  we  cannot  hope  to  avoid  misconceptions  if  we  use  the  method  of  ranks 
without  reference  to  the  rank  as  index  of  the  variate. 
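
The contrast between the two systems can be verified directly. The sketch below (helper names are illustrative assumptions) computes the product-moment correlation of the second system of variates and of its ranks.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Position of each value in ascending order (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


x = [-2.0, -1.9, 1.9, 2.0]
y = [-2.0, -0.01, 0.01, 2.0]
r_variates = pearson(x, y)
r_ranks = pearson(ranks(x), ranks(y))
print(round(r_variates, 2), r_ranks)
```

The variate correlation comes out a little above .72 while the rank correlation stays at unity, which is the point of the example.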

In  such  a  case  there  can  hardly  be  a  doubt  that  the  best  method  is  first  to 
consider  to  what  results  normal  distribution  will  lead  us,  and  secondly  if  the  formulae 
found  turn  out  to  be  of  a  simple  character  to  adopt  these  as  the  basis  by  definition  of 
the  variate  correlation  constant  as  found  from  a  method  of  ranks.  This  will  be 
the  course  adopted  in  the  present  memoir. 

(4)  Let  there  be  a  population  of  N  members  and  let  these  be  under  investigation 
for two correlated characters, means $m_1$, $m_2$, standard deviations $\sigma_1$, $\sigma_2$, correlation r.
I shall suppose normality of distribution. Let $m_1 + x$, $m_2 + y$ be the values of the
two characters in any individual. Then I term:

$$g_1 = \tfrac{1}{2}N - \frac{N}{\sqrt{2\pi}\,\sigma_1}\int_0^x e^{-x^2/2\sigma_1^2}\,dx, \qquad g_2 = \tfrac{1}{2}N - \frac{N}{\sqrt{2\pi}\,\sigma_2}\int_0^y e^{-y^2/2\sigma_2^2}\,dy \qquad \text{(xii)},$$

the x- and y-grades of the variates for the individual. It will be obvious that $g_1$ and
$g_2$ are mathematical functions of the variates and that accordingly the correlation
between them determines that between x and y, or vice versa.

Obviously $g_1$ and $g_2$ can be found from tables of the probability integral as soon as
x and y, the deviates, are known.

I  term  rank  the  actual  position  in  order  of  an  individual  with  regard  to  any 
variate in a given series obtained by measurement or observation. If $\nu_1$ be the
'rank' of an individual for a given character this signifies that in the observed
population there are $\nu_1 - \tfrac{1}{2}$ individuals with character greater than x. If therefore we
were to identify this with the grade we should have

$$g_1 = \nu_1 - \tfrac{1}{2} \qquad \text{(xiii)},$$

or $g_1$ would always differ from a whole number by .5. This, of course, it does not,
and  the  whole  problem  of  working  with  ranks  really  centres  on  the  degree  of 
approximation  which  is  made  when  we  proceed  from  ranks  to  grades  by  the  relation 
(xiii).  A  grade  determined  from  a  rank  and  not  from  a  variate  we  may  term  a 
spurious  grade ;  actually  the  real  grade  often  differs  by  several  units  from  the  spurious 
grade,  and  the  practical  problem  is :  To  what  extent  does  this  vitiate  the  use  of  ranks 
as  a  subsidiary  stage  to  the  determination  of  variate-correlation  ? 

I  shall  first  proceed  to  find  the  mean  and  standard  deviation  of  a  true  grade ; 
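(xii) shows us at once that $\bar g_1 = \bar g_2 = \tfrac{1}{2}N$ is the mean value of the grade.

The frequency of a given variate lying between x and x + δx

$$= \frac{N}{\sqrt{2\pi}\,\sigma_1}\,e^{-x^2/2\sigma_1^2}\,dx = dg_1 \quad \text{(numerically)}.$$

But the frequency of the variate must also be the frequency of its grade, or:

$$\sigma_{g_1}^2 = \frac{1}{N}\int_0^N (g_1 - \bar g_1)^2\,dg_1 = \frac{N^2}{12}.$$

Hence we have:

$$\sigma_{g_1}^2 = \sigma_{g_2}^2 = \tfrac{1}{12}N^2 \qquad \text{(xiv)}.$$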

Now whereas our grades are a continuous series, the spurious grades or ranks are
discontinuous and at intervals h = 1. (xiii) shows us at once that

$$\bar\nu_1 = \bar\nu_2 = \bar g_1 + \tfrac{1}{2} = \tfrac{1}{2}(N+1).$$

Further

$$\sigma_{g_1}^2 = \sigma_{\nu_1}^2 + \tfrac{1}{12}h^2,$$

the latter corresponding to the Sheppard's correction by which we pass from raw to
adjusted moments.

Thus we have:

$$\sigma_{\nu_1}^2 = \sigma_{\nu_2}^2 = \tfrac{1}{12}(N^2-1) \qquad \text{(xv)}.$$

(xv) must be used whenever we are dealing with ranks or spurious grades.
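
The mean and squared standard deviation of a set of ranks can be confirmed by direct summation. A small sketch in exact arithmetic follows; the function name is an assumption.

```python
from fractions import Fraction


def rank_mean_and_variance(n):
    """Mean and squared standard deviation of the ranks 1, 2, ..., n."""
    ranks = range(1, n + 1)
    mean = Fraction(sum(ranks), n)
    variance = sum((v - mean) ** 2 for v in ranks) / n
    return mean, variance


mean_10, variance_10 = rank_mean_and_variance(10)
print(mean_10, variance_10)
```

For n = 10 this gives a mean of (n + 1)/2 and a squared standard deviation of (n² − 1)/12, the latter falling short of the value n²/12 for true grades by 1/12.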

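Writing $i_1 = g_1 - \bar g_1$ and $i_2 = g_2 - \bar g_2$, I now turn to the determination of the product
moment of the grades. Let us put:

$$z = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right\};$$

then:

$$p_{g_1g_2} = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} i_1 i_2\,z\,dx\,dy \qquad \text{(xvi)}$$

gives the product moment of the grades.

Differentiate $p_{g_1g_2}$ with regard to r, which is not contained in either $i_1$ or $i_2$; we
have:

$$\frac{dp_{g_1g_2}}{dr} = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} i_1 i_2\,\frac{dz}{dr}\,dx\,dy.$$

But I have elsewhere* shown that

$$\frac{dz}{dr} = \sigma_1\sigma_2\,\frac{d^2z}{dx\,dy} \qquad \text{(xvii)}.$$

Accordingly:

$$\frac{dp_{g_1g_2}}{dr} = \sigma_1\sigma_2\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} i_1 i_2\,\frac{d^2z}{dx\,dy}\,dx\,dy.$$

Integrating twice by parts and noting that the part between limits vanishes in
both cases, we have:

$$\frac{dp_{g_1g_2}}{dr} = \sigma_1\sigma_2\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \frac{di_1}{dx}\,\frac{di_2}{dy}\,z\,dx\,dy.$$

Substituting for $di_1/dx$ and $di_2/dy$ and writing $x = x'\sigma_1$, $y = y'\sigma_2$, we find:

$$\frac{dp_{g_1g_2}}{dr} = \frac{N^2}{4\pi^2\sqrt{1-r^2}}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\exp\left\{-\frac{1}{2}\left(\frac{2-r^2}{1-r^2}\,x'^2 - \frac{2r}{1-r^2}\,x'y' + \frac{2-r^2}{1-r^2}\,y'^2\right)\right\}dx'\,dy' = \frac{N^2}{2\pi\sqrt{4-r^2}}.$$

Now if $\rho_{12}$ be the correlation of grades, we have:

$$p_{g_1g_2} = \rho_{12}\,\sigma_{g_1}\sigma_{g_2}.$$

Thus remembering (xiv)

$$\frac{d\rho_{12}}{dr} = \frac{6}{\pi}\,\frac{1}{\sqrt{4-r^2}},$$

or,

$$\rho_{12} = \frac{6}{\pi}\sin^{-1}\frac{r}{2} + \text{constant}.$$

Now $\rho_{12}$ and r must vanish together, hence the constant is zero. Accordingly we
have:

$$\rho_{12} = \frac{6}{\pi}\sin^{-1}\frac{r}{2}, \qquad r = 2\sin\left(\frac{\pi}{6}\,\rho_{12}\right) \qquad \text{(xviii)}.$$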

This  remarkably  simple  formula  enables  us  to  determine  the  value  of  the  true 
variate  correlation  from  a  correlation  of  grades  on  the  assumption  of  the  normal  law ; 
or  if  grades  may  be  replaced  by  ranks,  a  knowledge  of  the  correlation  of  ranks  will 
give  us  the  correlation  of  the  actual  variates  behind  the  order  exhibited  in  the 
ranking.  The  important  idea  embodied  in  the  above  formula  is  the  basis  of  the 
present  memoir,  and  is  as  far  as  I  am  aware  wholly  new. 
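
As a minimal sketch, relation (xviii) and its inverse can be written as two short Python functions; the function names are assumptions introduced for illustration.

```python
import math


def variate_from_grade_correlation(rho):
    """Formula (xviii): r = 2 sin(pi * rho / 6), on the assumption of normality."""
    return 2.0 * math.sin(math.pi * rho / 6.0)


def grade_from_variate_correlation(r):
    """Inverse relation: rho = (6 / pi) * arcsin(r / 2)."""
    return (6.0 / math.pi) * math.asin(r / 2.0)


for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(rho, round(variate_from_grade_correlation(rho), 4))
```

The two correlations vanish together and both reach unity together; between these limits the variate correlation is slightly the larger of the two.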

* Phil. Trans. A, Vol. 195, p. 25.
It remains for us to consider methods of finding the rank or grade correlation and
the  probable  error  of  such  methods. 

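(5) A convenient method of finding the grade correlation is that of formula (i),
p. 4; we have at once:

$$\rho_{12} = \frac{\sigma_{g_1}^2 + \sigma_{g_2}^2 - \sigma_{g_1-g_2}^2}{2\sigma_{g_1}\sigma_{g_2}}.$$

Or,

$$\rho_{12} = 1 - \frac{6\,S(g_1-g_2)^2}{N^3} \qquad \text{(xix)},$$

if we use true grades,

$$\rho_{12} = 1 - \frac{6\,S(\nu_1-\nu_2)^2}{N(N^2-1)} \qquad \text{(xx)},$$

if we use ranks $\nu_1$ and $\nu_2$.

If we use ranks the discovery of $S(\nu_1-\nu_2)^2$, or the sum of the squares of the
differences of ranks, forms a very easy process of determining $\rho_{12}$, due regard being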
paid  to  certain  points  to  be  dealt  with  in  the  illustrations  below.  Then  (xviii)  will 
give  the  variate  correlation. 
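
The two steps, rank correlation from squared rank differences by (xx) and then the variate correlation by (xviii), can be sketched as follows. The rankings are hypothetical and the function names are assumptions.

```python
import math


def rank_correlation(ranks_1, ranks_2):
    """Formula (xx): rho = 1 - 6 S(v1 - v2)^2 / (N (N^2 - 1)); assumes no ties."""
    n = len(ranks_1)
    sum_sq = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1.0 - 6.0 * sum_sq / (n * (n * n - 1))


def variate_correlation(rho):
    """Formula (xviii): r = 2 sin(pi * rho / 6)."""
    return 2.0 * math.sin(math.pi * rho / 6.0)


# Hypothetical rankings of five individuals for two characters.
ranks_1 = [1, 2, 3, 4, 5]
ranks_2 = [2, 1, 4, 3, 5]
rho = rank_correlation(ranks_1, ranks_2)
r = variate_correlation(rho)
print(rho, round(r, 4))
```

Here the sum of squared rank differences is 4, so the rank correlation is 0.8 and the corresponding variate correlation is a little higher.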

The probable error of $\rho_{12}$ and of r found in this way will be given in another
section. 

Since the determination of $\rho_{12}$ by (xx) is algebraically identical with finding $\rho_{12}$ by
the  product  moment,  and  such  product  moment  gives  the  least  probable  error  in  the 
determination  of  a  correlation  coefficient,  there  must  be  some  fallacy  in  a  statement 
which  has  been  propounded  among  the  psychologists  that  a  difference  method  of 
determining  the  correlation  will  give  p^^  with  about  |-  of  the  probable  error  of  the 
product  moment  method.    This  fallacy  will  be  considered  later. 

Meanwhile  it  is  of  interest  to  show  that  the  probable  error*  of 

is  of  the  form  : 

'67449 

P.E.  =  j==  {1-C,  pj  +  C,  pj  +  C,  pj  +  . . .) 

s/n  —  1 

where $c_1$, $c_2$, $c_3$, ... are undetermined constants. Or, the probable error of $\rho_{12}$ for $\rho_{12} = 0$,
or for uncorrelated ranks is

$$.67449/\sqrt{n-1},$$

i.e.  is  absolutely  identical  with  probable  error  of  a  coefficient  of  correlation  of  any  two 
uncorrelated  variables,  and  is  not  as  asserted  much  smaller. 

Since for ranks $\sigma_{\nu_1}$ and $\sigma_{\nu_2}$ are constant, we have† to find the value of

$$\sigma_p^2, \quad \text{where} \quad p = \frac{S(\nu_1\nu_2)}{n} - \left\{\tfrac{1}{2}(n+1)\right\}^2,$$

$\nu_1$ and $\nu_2$ being independent, in order to reach the squared standard deviation of
$\rho_{12}$ for $\rho_{12} = 0$.

* n is here put for N as more convenient for the algebraic work which follows.
† I owe the following proof to the kindness of my friend "Student."
There being no correlation, n! arrangements of this product occur with equal
frequency.  Hence 

^n+iy  /n+r 


Next  any  v^v^  occurs  in  (w— 1)!  of  the  arrangements,  for  if  be  paired  with 
the  remaining  7i—l  pairs  may  be  arranged  in  {n—l)\  ways.  Thus 


2(^j5(,,)/„}  =  2(«-l)!(«-±-7^ 


n 


,  fn+  1 
=  2  {n) ! 


Further:  t  ('^)J  =  2i  + 2  ^(., 

where  i//,      are  different  from  z^i,  1^3. 

Now  i/iV/  occurs  in  (n  —  l) !  arrangements  ;  hence 

Next  v^v„ylv^  occurs  in  (n  — 2)  !  arrangements. 
Thus.  = 


where      and     may  now  take  all  values. 
Thus : 


^  1  ^^^5  ■  r  =  ^ — j  S  (I'll',}  S  (I'l    )   [Vi    +  ^2  J'l)  +  {vivi) 

144     ('^+ lN9^i'+3w'-8n-4} 


_ (^1-2)  !  1  /n  (n+l)Y _  ^  /w  (n+l)^  1)  (2n+ 1)  _^  /ri  (n+l)(2??  +  l) 
Collecting the various parts we find:

^. ^  (»-!)!  ("+  !)•  (2«+l)-  +  !  („  +  1).  (9„. ^  3„. _  3„  _ 

36  144      V        /  V  / 


o  ,  /n+iy  ,    ,  /n+l 


or,  after  reducing:,  =  — . 

&'  144 

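Therefore the mean value of the square of $S(\nu_1\nu_2)/n - \{\tfrac{1}{2}(n+1)\}^2$ is

$$\frac{(n+1)^2(n-1)}{144}.$$

Now the probable error of $\rho_{12}$, for uncorrelated ranks:

$$= .67449\,\frac{1}{\sigma_{\nu_1}\sigma_{\nu_2}}\sqrt{\frac{(n+1)^2(n-1)}{144}} = .67449/\sqrt{n-1} \qquad \text{(xxi)}.$$

It thus follows that if the value of $\rho_{12}$ be not two or three times the expression
(xxi), there is no significant correlation of ranks, and therefore no significant
correlation of the corresponding variates.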
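
The squared standard deviation 1/(n − 1) behind (xxi) can be confirmed for a small series by enumerating every one of the n! arrangements. The sketch below uses exact fractions; it is an illustrative check with assumed function names, not the algebraic proof given above.

```python
import math
from fractions import Fraction
from itertools import permutations


def rank_correlation(perm):
    """Rank correlation of the ranks 1..n against the arrangement `perm`."""
    n = len(perm)
    sum_sq = sum((i + 1 - v) ** 2 for i, v in enumerate(perm))
    return 1 - Fraction(6 * sum_sq, n * (n * n - 1))


def null_variance(n):
    """Squared standard deviation of the rank correlation over all n! arrangements."""
    values = [rank_correlation(p) for p in permutations(range(1, n + 1))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def probable_error(n):
    """Formula (xxi): .67449 / sqrt(n - 1), for uncorrelated ranks."""
    return 0.67449 / math.sqrt(n - 1)


print(null_variance(5), probable_error(5))
```

For n = 5 the enumeration gives exactly 1/4, that is 1/(n − 1), so the probable error is .67449/2.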

(6)    On  the  Difference  Method  of  finding  the  Correlation  of  Grades. 

Exactly as in the first section of this paper we may seek the correlation of grades by means of the sum $S(g_1 - g_2)$ of all their positive differences. This is slightly shorter than finding $S(g_1 - g_2)^2$, but only very slightly so, and it may be doubted whether the increased rapidity of working at all compensates for the decreased accuracy of the process. Still the result is interesting and throws considerable light on one or two allied points.

Let $G = S(g_1 - g_2)$, where the sum $S$ is for all x-grades which are greater than corresponding y-grades.

Let  us  put  x  =  (T^x',  y  =  (r2y',  and  write 


111, 

z  =  1   z. 

277  V  1  —  O-jCTs 


^    r+co  rx' 

Then  :  ^  =  7^        \     Ux'  -jy)  ^  dydx 

V  ^77  J  —00  J  —CO
(2^)i

^  

(277-)^  Jl-r" 


d 


dx'dy'  \Jl  -1^1 


,  I  dy'dad 


+  00 

—00  ^^  _ 

+  00  fx' 


I  •  _  •  \  — 


—  00  J     —  00  tti/y  I 


''y'^z']dy'dx'. 


clG 
dr 


(27r)2  Jl—r'j  -oo  j-oo 
Put  y'  =  x'  —  y";  then  after  rearranging: 

X  e    ^{l+rV     3  +  r ^  ;  +  (1  - r) (3+r) ^  /  dy"dx'. 

The  order  of  integration  can  now  be  transposed  and  if  X  be  written  for 

/    2  +  r  „ 

X  —~  y, 

3  +  r  ^  ' 


the  limits  of  X  will  also  be  —  oo  to  +  oo  .  Thus 

2 




dr~     (277)^(1 -r^)t  Jo 

But  if  c  have  any  value  : 


e  dXdy". 


r\-ioU-XcZX  =  0,    and     f^-^^^^^"  c/X  =  v/2^  -  . 

_/    -  00  J  -  00  C 


Hence 


1    2(l+r)  /H- 


,,"2 


(l_r7M27r)    3+r  V3  +  rJo 

N''  1  2  (1  (1  -r)  (3  +  r) 
"2^(l-r)2  (3  +  r)i  2 

$$\frac{dG}{dr} = -\frac{N^2}{2\pi}\,\frac{1}{\sqrt{(1-r)(3+r)}} = -\frac{N^2}{2\pi}\,\frac{1}{\sqrt{4-(1+r)^2}}.$$

Hence integrating:

$$G = \text{constant} + \frac{N^2}{2\pi}\cos^{-1}\frac{1+r}{2}.$$

But when $r = 1$, $G$ must be zero; therefore the constant is zero, or inverting:

$$r = 2\cos\frac{2\pi G}{N^2} - 1.$$

Or, finally*:

$$r = 2\cos 2\pi\left(\frac{S(g_1-g_2)}{N^2}\right) - 1 \qquad \text{(xxii)}.$$

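This gives us the correlation of two variates from the corresponding grades by a difference method.

If $r$ be zero, we must have $2\pi\,S(g_1-g_2)/N^2$ equal to $60° = \pi/3$, or $S(g_1-g_2) = \tfrac{1}{6}N^2$ when there is no correlation of variates. This is easily proved directly, for in this case:

$$G = S(g_1-g_2) = \frac{N}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{x'}(g_x-g_y)\,e^{-\frac12(x'^2+y'^2)}\,dy'\,dx' = \frac{1}{N}\int_0^N\int_0^{g_x}(g_x-g_y)\,dg_y\,dg_x = \frac{1}{N}\int_0^N \tfrac12 g_x^2\,dg_x = \frac{N^2}{6}.$$

For ranks the corresponding expression to be used is $\tfrac16(N^2-1)$, or we have:

$$r = 2\cos 2\pi\left(\frac{S(v_1-v_2)}{N^2-1}\right) - 1 \qquad \text{(xxiii)}.$$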


As  before  the  truth  of  (xxii)  depends  on  the  approximation  to  normal  correlation. 

If we combine (xx), (xviii) and (xxiii) we have the relation between $S(v_1 - v_2)^2$ and $S(v_1 - v_2)$ which holds in the case of normal correlation.

Writing $R = 1 - S(v_1 - v_2)/\tfrac16(N^2 - 1)$, we have:

$$r = 2\sin\frac{\pi}{6}\rho_{12} = 2\cos\frac{\pi}{3}(1 - R) - 1 \qquad \text{(xxiv)}.$$

Dr Spearman gives† the relation:

$$\rho_{12} = \sin\left(\frac{\pi}{2}R\right) \qquad \text{(xxv)}$$

(he neither connects $\rho_{12}$, nor $R$, with $r$) as apparently an empirical relationship and speaks of it as "all that could be desired."    It is clearly incompatible with normal

* The relationship of (xxii) to (vii) is easily seen if we expand the cosine as far as the square of the angle.    We have

(vii) would have given us 1 instead of the factor 1.0472. Thus when there is high correlation, or $S(g_1-g_2)$ is small, we see that the difference method with grades leads us to nearly the same result, as the assumption that the grades themselves form a normal distribution.    This suggests that Spearman would have got
much  better  results  for  his  "footrule"  for  measuring  correlation  had  he  taken  ^  =  1  -  3  -^-^—^  ( 

iv  —  1  \     Ncr  J 

instead  of  1  —  —^^2     >  foi"  this  value,  i.e.  1  -  (^~^^  i"-        notation,  would  have  been  almost  the  true 

variate  correlation  r. 

† Journal of Psychology, Vol. II. p. 102.
correlation, which at any rate is a fairly good guide for general relations of this sort in the theory of frequency. Table I. gives the values of $r$ and $R$ for each 0.05 of $\rho_{12}$. Table II. gives the values of $r$ and $\rho_{12}$ for each 0.05 of $R$, and in the last column the value of $\rho_{12}$ which would arise if (xxv) were correct.
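
The entries of Tables I. and II. can be regenerated from the two relations. The Python sketch below is an illustration added for the reader, not the author's own computation; it reads (xxiv) as $r = 2\sin(\pi\rho_{12}/6) = 2\cos\{\pi(1-R)/3\} - 1$ and (xxv) as $\rho_{12} = \sin(\pi R/2)$, and all function names are hypothetical.

```python
import math


def r_from_rho(rho):
    """Variate correlation from the grade correlation, r = 2 sin(pi*rho/6)."""
    return 2 * math.sin(math.pi * rho / 6)


def r_from_R(R):
    """Variate correlation from R, r = 2 cos(pi*(1 - R)/3) - 1."""
    return 2 * math.cos(math.pi * (1 - R) / 3) - 1


def rho_from_r(r):
    """Inverse of r_from_rho."""
    return 6 / math.pi * math.asin(r / 2)


def R_from_r(r):
    """Inverse of r_from_R."""
    return 1 - 3 / math.pi * math.acos((1 + r) / 2)


def rho_spearman(R):
    """Spearman's empirical relation (xxv), rho = sin(pi*R/2)."""
    return math.sin(math.pi * R / 2)


if __name__ == "__main__":
    print("Table I:  rho, r, R")
    for k in range(0, 21):
        rho = k / 20
        r = r_from_rho(rho)
        print(f"{rho:.2f}  {r:.3f}  {R_from_r(r):.3f}")
    print("Table II: R, r, rho, rho by (xxv)")
    for k in range(0, 21):
        R = k / 20
        r = r_from_R(R)
        print(f"{R:.2f}  {r:.3f}  {rho_from_r(r):.3f}  {rho_spearman(R):.3f}")
```

The same functions accept negative $R$ down to $-0.5$, at which point $r$ and $\rho_{12}$ both reach $-1$.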

Table I.    Correlation of Variates from Mean Square Difference of Grades.

    ρ12      r        R      |    ρ12      r        R
    0.00    0.000    0.000   |    0.50    0.518    0.323
    0.05    0.052    0.029   |    0.55    0.568    0.361
    0.10    0.105    0.059   |    0.60    0.618    0.400
    0.15    0.157    0.089   |    0.65    0.668    0.442
    0.20    0.209    0.120   |    0.70    0.717    0.486
    0.25    0.261    0.152   |    0.75    0.765    0.533
    0.30    0.313    0.184   |    0.80    0.813    0.584
    0.35    0.364    0.217   |    0.85    0.861    0.644
    0.40    0.416    0.251   |    0.90    0.908    0.709
    0.45    0.467    0.286   |    0.95    0.954    0.796
    0.50    0.518    0.323   |    1.00    1.000    1.000

Table II.    Correlation of Variates from Difference of Grades.

    R       r        ρ12      (xxv)   |    R       r        ρ12      (xxv)
    0.00    0.000    0.000    0.000   |    0.50    0.732    0.716    0.707
    0.05    0.089    0.085    0.078   |    0.55    0.782    0.767    0.760
    0.10    0.176    0.168    0.156   |    0.60    0.827    0.814    0.809
    0.15    0.259    0.248    0.233   |    0.65    0.867    0.856    0.853
    0.20    0.338    0.324    0.309   |    0.70    0.902    0.894    0.891
    0.25    0.414    0.398    0.383   |    0.75    0.932    0.926    0.924
    0.30    0.486    0.469    0.454   |    0.80    0.956    0.952    0.951
    0.35    0.554    0.536    0.522   |    0.85    0.975    0.973    0.972
    0.40    0.618    0.600    0.587   |    0.90    0.989    0.988    0.988
    0.45    0.677    0.660    0.649   |    0.95    0.997    0.997    0.997
    0.50    0.732    0.716    0.707   |    1.00    1.000    1.000    1.000

Now these Tables bring out several interesting facts. The first is the remarkable closeness between the correlation of the grades and the true correlation of the variates, if we suppose the system normal. The maximum difference as shown by Table I. is 0.018 and actually the maximum of $r - \rho_{12}$ occurs when $\rho_{12} = 0.5756$ and is then 0.0180. Thus, the difference will often be of the order of the probable error. The formula (xviii) is so simple, that we can always deduce the variate correlation at once from the grade correlation. I propose to define $r$ as given by (xviii) as the grade-variate correlation. Whenever the system is normal, or approximately normal, this will agree with the true variate correlation closely.    Next Table I. shows us that equal differences of $\rho_{12}$ give almost equal differences of $r$, i.e. the differences only range from 0.052 to 0.046 of $r$ for differences of 0.050 of $\rho_{12}$. On the other hand the differences of $r$ for equal differences 0.050 of $R$ vary from 0.089 to 0.003, or second differences become of importance. Clearly for high values of $R$, $r$ will be found much more closely than for low values.

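If $E_r'$ be the error in $r$ due to an error $E_{\rho_{12}}$ in $\rho_{12}$, and $E_r''$ be the error due to an error $E_R$ in $R$, we have:

$$E_r' = \frac{\pi}{3}\cos\frac{\pi}{6}\rho_{12} \times E_{\rho_{12}},$$

$$E_r'' = \frac{2\pi}{3}\sin\frac{\pi}{3}(1-R) \times E_R$$

if we use differentials. For the special case of $\rho_{12} = R = 0$, we have seen that the probable error of $\rho_{12} = 0.67449/\sqrt{n-1}$; it will be seen later that the probable error of $R$ is $0.4266/\sqrt{n-1}$ nearly, and if $E_r$ be the probable error of $r = 0$, as found in the ordinary product moment way, we have:

$$E_r : E_r' : E_r'' = \frac{0.6745}{\sqrt{n-1}} : \frac{\pi}{3}\,\frac{0.6745}{\sqrt{n-1}} : \frac{2\pi}{3}\,\frac{\sqrt{3}}{2}\,\frac{0.4266}{\sqrt{n-1}} = \frac{0.6745}{\sqrt{n-1}} : \frac{0.7063}{\sqrt{n-1}} : \frac{0.7738}{\sqrt{n-1}} \qquad \text{(xxvi)}.$$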

Thus we see that, contrary to what has been asserted, the accuracy of the new methods, when they are measured by the determination of the true correlation, is less than that of the old product moment method. In particular it requires about 30 per cent more observations by the $R$ method to obtain $r$ with the same degree of certainty, when $r = 0$.
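
The arithmetic behind (xxvi) and the figure of about 30 per cent can be reproduced in a few lines. This Python sketch is only a numerical check; the variable names are hypothetical, and the common factor $1/\sqrt{n-1}$ is dropped because it cancels in the ratios. It relies on the fact that the number of observations needed for a given probable error varies as the square of the numerical factor.

```python
import math

# Numerical factors of the three probable errors at r = 0 (common 1/sqrt(n-1) omitted).
pe_product_moment = 0.6745                            # E_r
pe_from_rho = (math.pi / 3) * 0.6745                  # E_r'  = (pi/3) cos(0) * P.E. of rho
pe_from_R = (2 * math.pi / 3) * math.sin(math.pi / 3) * 0.4266  # E_r'' at R = 0

# Observations needed scale as the square of the probable-error factor.
extra_observations_R = (pe_from_R / pe_product_moment) ** 2
extra_observations_rho = (pe_from_rho / pe_product_moment) ** 2

if __name__ == "__main__":
    print(f"E_r : E_r' : E_r'' = {pe_product_moment:.4f} : {pe_from_rho:.4f} : {pe_from_R:.4f}")
    print(f"relative number of observations, R method:   {extra_observations_R:.3f}")
    print(f"relative number of observations, rho method: {extra_observations_rho:.3f}")
```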

At present we do not know the $R$ factor term in $E_R$, when $R$ differs from zero, and accordingly cannot test $E_r$, $E_r'$ and $E_r''$ at other values of $R$ or $\rho_{12}$, but I have little doubt of the general truth of the result that $E_r$ is at all values as well as for $r = 0$, sensibly less than $E_r'$ and still less than $E_r''$.

(7)    Remarks  on  the  Probable  Error  of  R. 

The probable error of a quantity in which the limits of the summation vary as we make random variations in the constants is always a troublesome matter, and I have not yet succeeded in evaluating the probable error of $S(g_1-g_2)$ when $g_1 > g_2$ for any value of $r$.

Spearman has investigated the probable error of the corresponding expression for ranks, $S(v_1 - v_2)$, when there is no correlation between the ranks. He finds that for $n$ observations the probable error of $R$ may be taken as $0.43/\sqrt{n}$, and from this result he has drawn rather sweeping conclusions as that: "twenty cases treated in one of the ways described furnish as much certitude as 180 in another more usual way"; or that: "a probable error may at present be admitted without much hesitation up to 0.05; so that by adopting the method of calculation recommended, two to three dozen subjects would be sufficient for most purposes*." Now these statements seem to me not without grave danger, and accordingly it is well to see where the error has crept in.

Spearman gives the value $0.4266/\sqrt{n}$, but it should be $0.4266/\sqrt{n-1}$†, and accordingly since we have seen that the probable error of $\rho_{12}$, for $\rho_{12} = 0$, is $0.6745/\sqrt{n-1}$, the probable error of $R$ would only be about ⅔ of the probable error of $\rho_{12}$, and upon this Spearman's statements are based.

Now the probable error of any quantity is conventionally $0.67449 \times \text{standard deviation}/\sqrt{n-1}$, and accordingly for the same number of observations the probable error is less when the standard deviation is less. But there would be no meaning in asserting that the mean of 20 metacarpal bones could be found with much more exactitude than the mean of 20 humeri, because the latter being a larger bone had a greater variability. We must either measure the same quantity by different processes, or else be at any rate certain that our quantities are alike in character and function
before  we  compare  their  probable  errors.  The  probable  error  of  is  certainly  less 
than that of $x$. Now $\rho_{12}$ is a true correlation and ranges from +1 through 0 to -1 with a symmetrical distribution about 0, if we take the case of a random distribution of ranks. The quantity $R$ presents nothing of this nature at all; random distribution of ranks does not give a symmetrical distribution for $R$, its range is not from +1 to -1, and there are certain values it can never take. In order to bring out these points I take the following table for $R$ negative.

Table III.    Negative Correlation of Variates from Difference of Grades.

     R         r         ρ12
    -0.05    -0.092    -0.088
    -0.10    -0.187    -0.178
    -0.15    -0.283    -0.271
    -0.20    -0.382    -0.367
    -0.25    -0.482    -0.465
    -0.30    -0.584    -0.566
    -0.35    -0.687    -0.670
    -0.40    -0.791    -0.777
    -0.45    -0.895    -0.886
    -0.50    -1.000    -1.000

N.B. It will be observed that when $R$ is negative, the true variate correlation is almost double the magnitude of $R$, while if $R$ be positive (Table II.) $r$ is larger than $R$ but not to this exaggerated extent. It will be clear that no estimate of the real correlation can be based on $R$, if it does not allow for this exaggeration.


*  American  Journal  of  Psychology,  Vol.  xv.  pp.  100,  101.  For  the  proof  of  the  probable  error  cited 
see  :  British  Journal  of  Psychology,  Vol.  ii.  pp.  105-8. 

t  Spearman's  result  at  bottom  of  p.  108  may  be  written  -j^ — — ^ — ^  ,  or  neglecting  terms  in 
not ~ as he does, this gives $0.4266/\sqrt{n-1}$ as we should anticipate.
Thus we see that while $r$ and $\rho_{12}$ run from -0.0 to -1.0, $R$ only runs from -0.0 to -0.50.

In order to obtain his probable error for $R$ Spearman takes every random arrangement of ranks $v_1$ and $v_2$ for which $v_1$ is greater than $v_2$. He has neglected to observe that when he does this his $R$ will become negative, but that it will not range from -1.0 to +1.0. For example, I take the following system of ranks for $(2m+1)$ individuals:

    v1 =    1       2       3       4     ...    2m+1
    v2 =  2m+1     2m     2m-1    2m-2    ...     1

This gives:

$$S(v_1-v_2)^2 = 2\{(2m)^2 + (2m-2)^2 + (2m-4)^2 + \dots + 2^2\} = 8m(m+1)(2m+1)/6,$$

$$\therefore\ \rho_{12} = 1 - \frac{6\,S(v_1-v_2)^2}{N(N^2-1)} = 1 - \frac{8m(m+1)(2m+1)}{(2m+1)\{(2m+1)^2-1\}} = -1.$$

But

$$S(v_1-v_2) = 2 + 4 + \dots + (2m-4) + (2m-2) + 2m = m(m+1).$$

Therefore:

$$R = 1 - \frac{6\,S(v_1-v_2)}{N^2-1} = 1 - \frac{6m(m+1)}{(2m+1)^2-1} = -0.5 \qquad \text{(xxvii)}.$$
Accordingly when the correlation is negative and perfect, the number of observations being odd, $R$ will never take the value -1, but no greater value than -0.5; whereas if we reckon our second ranks in the negative direction $R$ will equal +1.

Here the Spearman formula (xxv) leads to the absurd result $\rho_{12} = -1/\sqrt{2}$, instead of -1.    On the other hand my formulae (xxiv) for $\rho_{12} = -1$ and $R = -0.5$ give absolutely the correct value $r = -1$ for the variate correlation.

Again take $N$ even $= 2m$ and consider the system:

    v1 =   1       2       3       4     ...    2m
    v2 =  2m     2m-1    2m-2    2m-3    ...     1

We find:

$$S(v_1-v_2)^2 = 2\{(2m-1)^2 + (2m-3)^2 + (2m-5)^2 + \dots + 1^2\} = \frac{2m}{3}(4m^2-1),$$

and this gives

$$\rho_{12} = 1 - \frac{6\cdot\frac{2m}{3}(4m^2-1)}{2m(4m^2-1)} = -1.$$

Again

$$S(v_1-v_2) = 1 + 3 + 5 + \dots + (2m-5) + (2m-3) + (2m-1) = m^2.$$

Hence:

$$R = 1 - \frac{6m^2}{4m^2-1} = -\frac{2m^2+1}{4m^2-1} = -0.5\left\{1 + \frac{3}{N^2-1}\right\} \qquad \text{(xxviii)}.$$

For:  N = 4,  R = -0.600;  N = 10,  R = -0.515;  N = 20,  R = -0.504;  N = 100,  R = -0.500.

Or, again, the limit -0.5 is rapidly reached as the number of observations increases. In fact solely for the simple case of two observations is it possible for $R$ to reach -1.
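
The two worked systems of perfectly reversed ranks are easily checked by machine. The Python sketch below is an illustration only, with hypothetical function names; it takes $R = 1 - 6S(v_1-v_2)/(N^2-1)$ with the sum over positive differences alone, and $\rho_{12} = 1 - 6S(v_1-v_2)^2/\{N(N^2-1)\}$, and uses exact fractions so that the values can be compared with (xxvii) and (xxviii).

```python
from fractions import Fraction


def footrule_R(v1, v2):
    """Spearman's R from the sum of positive rank differences only."""
    n = len(v1)
    s_pos = sum(a - b for a, b in zip(v1, v2) if a > b)
    return 1 - Fraction(6 * s_pos, n * n - 1)


def rank_rho(v1, v2):
    """Rank correlation from the sum of squared rank differences."""
    n = len(v1)
    d2 = sum((a - b) ** 2 for a, b in zip(v1, v2))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


def reversed_case(n):
    """R and rho when the second set of ranks is the first in reverse order."""
    v1 = list(range(1, n + 1))
    v2 = v1[::-1]
    return footrule_R(v1, v2), rank_rho(v1, v2)


if __name__ == "__main__":
    for n in (2, 4, 5, 10, 20, 21, 100):
        R, rho = reversed_case(n)
        print(f"N = {n:3d}   R = {float(R):+.3f}   rho = {float(rho):+.3f}")
```

For every $N$ the rank correlation comes out as $-1$, while $R$ equals $-0.5$ for odd $N$ and approaches it from below for even $N$.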

If it be objected to (xxiv) that it would now give for values of $R$ greater than -0.5 values of the variate correlation greater than -1 (= -1.09 at a maximum for $N = 4$), this is overlooking the point that (xxiv) is deduced from (xxii) by replacing true grades by spurious grades or ranks, and that if we retain (xxii) then

and  r  =  —  1  as  it  should  do. 

We have now reached I think the basis of Spearman's apparent paradox. While the variation of the true rank correlation $\rho_{12}$ lies between +1 and -1 and has $0.67449/\sqrt{N-1}$ for its probable error, the value of $R$ only ranges between +1 and -0.5, and may well have a less value for its probable error.

Now Spearman tells us that large negative values of his $R$ should be avoided*. There is no necessity whatever for avoiding them if we are seeking the variate correlation by the formula given in this memoir. But if we are seeking the probable error of a zero quantity, which may vary on either side of zero (and in this case the variation is not symmetrical about zero), we cannot neglect the distribution of random variations below zero. If Spearman wishes his $R$ to be considered always positive, then he ought to have found the probable error on the assumption that $S(v_1 - v_2)$ should never be greater than $\tfrac16(N^2 - 1)$. He has taken a quantity which ranges from +1 to -0.5 and compared its random variations with one which ranges from +1 to -1 for the same frequency. If he had restricted his attention to variations of $R$ between 0 and +1 and of $\rho_{12}$ between 0 and +1 he would not have reached the same conclusion.

But there is a further very serious indictment to be made against Spearman's $R$. For values of $N$ fairly small, which are those for which he proposes to use it, $R$ retains a constant value for wide variations in $\rho_{12}$. We can show this on an exaggerated scale by writing down the possible values for Spearman's $R$ and the true rank correlation for 4 individuals taken with random ranks.    See Table on p. 23.

A little consideration will show to what much better results $\rho_{12}$ leads us than $R$. $R$ in fact remains constant and = -0.2 while $\rho_{12}$ passes through the values 0, -0.2, -0.4 and -0.6; or $r$ can take values from 0 to -0.62, while its value as found from $R$ remains -0.38. This simple illustration of how the real rank correlation varies widely while Spearman's coefficient $R$ remains constant shows how unsuitable the latter is, when we have to deal with small series.

* Loc. cit., footnote, p. 96.

Another point worth noting is that, if we take the positive values of the correlation only, the mean value of $R$ is 0.3818, while the mean value of the corresponding $\rho_{12}$'s is 0.5454; the former has a standard deviation of 0.2622 and the latter of 0.2573, showing that we are not justified in asserting that $R$ has a smaller probable error than $\rho_{12}$ when we take comparable quantities.

Spearman appears to have an idea that $R$ is really a coefficient comparable with $\rho_{12}$, and he attempts to get over some difficulties which have arisen, by telling us to reverse one series of ranks when $R$ comes out negative. But reversing the ranks does not aid us to the right result. Thus if the ranks in the 12th and 13th column of the above table be reversed, we find that $R$ still remains negative and of the same magnitude -0.2. In fact it is easy to write down a system of ranks which give a negative $R$, and which on reversal give a negative $R$ six or seven times as big. The fact is simply that $R$ is not a symmetrical function of $\rho_{12}$ and reversal of ranks does not necessarily reverse $R$ in sign.

We see accordingly (i) that the total range of $R$ is only about ¾ that of $\rho_{12}$, and that if we make the range the same by any attempt to reverse ranks, the Spearman method of calculating the probable error for $R = 0$ is erroneous. (ii) That the distribution of $R$ for random rankings has a median which differs from zero, is very skew, and is in no ways comparable with that for $\rho_{12}$.

A point to be borne in mind most carefully is that for a given value of $R$, $\rho_{12}$, the true rank correlation, may take a great variety of values. It is only when (i) the number of observations is fairly considerable, and (ii) we assume some distribution of associated grades such as that of normal correlation, that we are able to assert that the value of $R$ will fix $\rho_{12}$, but such a relationship as that connecting $\rho_{12}$, $R$ and the variate correlation can only be fixed, as in this memoir, by the appeal to despised mathematical analysis.

Thus  the  advantages  claimed  by  Spearman  for  R,  namely  :  (a)  that  it  frees  the 
discussion  from  the  complexities  of  mathematical  analysis,  and  (b)  that  it  gives  a  less 
probable  error  than  more  usual  ways  of  approaching  the  subject,  are  seen  to  be 
illusory. 

The difficulty that $\rho_{12}$ may take a whole series of values for a single value of $R$ is only surmounted if we define the character of our frequency distribution, and there is no doubt that we shall obtain a first approximation by defining it as normal. Secondly, we cannot reverse ranks with the effect Spearman proposes, and if we could his probable error of $R$ for $R = 0$ would be erroneous. Lastly, if we do not reverse ranks, then the probable error of one and the same quantity, the variate correlation, is considerably greater (for the only case yet worked out, i.e. $R = 0$) when found by Spearman's method, than when found by the well-known method of squares of differences, and still less than if found by the product of the variates directly. The squares of the differences of ranks can be taken so directly and quickly from a table of squares, that it does not seem to me that the slight rapidity gained in using positive differences of ranks is of any weight against its increased inaccuracy for small series, where indeed it is likely to be chiefly used.

Further  no  two  rank  correlations  are  in  the  least  reliable  or  comparable  unless  we 
assume  that  the  frequency  distributions  are  of  the  same  general  character  (see  p.  9), 
and this general character will, till further advance be made in the theory of skew correlation, be undoubtedly that provided by the hypothesis of normal distribution.
On  this  assumption  Spearman's  suggestion  of  correlation  of  ranks  becomes  valid,  but 
not  as  he  supposes  as  a  Ding  an  sich,  but  only  as  a  means  of  passing  at  any  rate  to 
an  approximation  to  the  variate  correlation,  and  this  in  the  case  of  quantities  where 
it  is  easier  to  rank  individuals  than  to  measure  their  attributes  accurately. 

For the grounds stated in this section, I propose to use as a rule $\rho_{12}$ and not $R$ to find $r$. For this reason I have spent my energies in finding the probable error of $\rho_{12}$ instead of seeking that of $R$.

(8)    On  the  Probable  Error  of  the  Correlation  of  Grades. 

The following investigation is admittedly lengthy, but I have not seen my way to shorten it, and the main point is to reach by some road the expression for the probable error. The most general expression for the probable error of a correlation whatever be the distribution is to be found from $0.67449\,\Sigma_r$, where*:

$$\Sigma_r^2 = \frac{r^2}{N}\left\{\frac{p_{22}}{p_{11}^2} + \frac{p_{22}}{2p_{20}p_{02}} + \frac{p_{40}}{4p_{20}^2} + \frac{p_{04}}{4p_{02}^2} - \frac{p_{31}}{p_{11}p_{20}} - \frac{p_{13}}{p_{11}p_{02}}\right\}$$

and $p_{qq'} = S\{n_{xy}(x - \bar{x})^q (y - \bar{y})^{q'}\}/N$ (xxix).

Now in our case $x$ and $y$ are to be the grades $g_1$ and $g_2$ and $r$ is to be $\rho_{12}$, which we will write for this investigation $\rho$.
We have at once:


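$$N \times \bar{g}_1 = \int_{-\infty}^{+\infty} g_x\,\frac{N}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac12 x^2/\sigma_1^2}\,dx,$$

and:

$$g_x = \tfrac12 N + \frac{N}{\sqrt{2\pi}\,\sigma_1}\int_0^x e^{-\frac12 x^2/\sigma_1^2}\,dx, \qquad dg_x = \frac{N}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac12 x^2/\sigma_1^2}\,dx.$$

* Mathematical Contributions to the Theory of Evolution, xiv. "Draper's Research Memoirs," Biometric Series ii, p. 20.    I have omitted certain terms which cancel.

Thus:

$$N p_{m0} = \int_0^N (g_1 - \bar{g}_1)^m\,dg_1 = 0, \text{ if } m \text{ be odd}, \qquad = \frac{2}{m+1}\left(\tfrac12 N\right)^{m+1}, \text{ if } m \text{ be even}.$$

Thus $p_{40} = p_{04} = \tfrac{1}{80}N^4$ and $p_{20} = p_{02} = \tfrac{1}{12}N^2$ as before (xxx).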
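
The moments in (xxx) follow from the grades being uniformly spread over the range 0 to $N$. As a check, the Python sketch below (an illustration only; the function name is hypothetical) evaluates $p_{m0} = \frac{1}{N}\cdot\frac{2}{m+1}(\tfrac12 N)^{m+1}$ for even $m$ in exact fractions, and also gives the ratio $p_{40}/p_{20}^2$, which is the value 1.8 met with later for the 'ridge' case.

```python
from fractions import Fraction


def grade_moment(m, N=1):
    """Central moment p_m0 of grades spread uniformly over (0, N)."""
    if m % 2 == 1:
        return Fraction(0)  # odd moments vanish by symmetry about N/2
    return Fraction(2, m + 1) * Fraction(N, 2) ** (m + 1) / N


if __name__ == "__main__":
    p20 = grade_moment(2)
    p40 = grade_moment(4)
    print("p20 / N^2 =", p20)          # expected 1/12
    print("p40 / N^4 =", p40)          # expected 1/80
    print("p40 / p20^2 =", p40 / p20 ** 2)
```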

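We can now write (xxix) in the form:

$$\Sigma_\rho^2 = \frac{1}{N}\left\{\frac{p_{22}}{p_{20}^2}\left(1 + \tfrac12\rho^2\right) + 0.9\rho^2 - 2\rho\,\frac{p_{31}}{p_{20}^2}\right\} \qquad \text{(xxxi)},$$

assuming as we shall show in the sequel (p. 30) that $p_{31} = p_{13}$. Accordingly we have now to find $p_{22}$ and $p_{31}$.

First to find: $p_{22} = \frac{1}{N}\{S(g_1 - \bar{g}_1)^2 (g_2 - \bar{g}_2)^2\}$,

or if we use the notation of p. 11,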

I   r+Qo  r+oo  _  _ 

I      h^h^zdxdy  (xxxii). 

Now  I  have  not  succeeded  in  integrating  this  expression,  although  I  have  spent 
much  time  over  it,  but  I  have  expanded  it  in  powers  of  the  variate  correlation  r. 


 1 

If  U= 


Vl  -r 

andja;/  andy^,/  are  the  same  as  on  p.  15,  we  can  write 


00   r  +  00 


in 


But*  U  =  s'-,v^w,,e-^-^''"+^"^ 


or,  V-^=^^^L\^ 


+  GO  ^  f  00 


where  j J  e'^^'^.^dx' =  \     j y'  e"      w,,dy'   (xxxiii). 

J  -  00  J  -ao 

If  n  be  odd,  v„  and  have  odd  powers  and  =  0,  hence  p,^  contains  only  even 
powers  of  r. 

First  :  ?o  =  r  °°  ;V e  "  ^o^x, 


where  we  may  drop  the  dashes  from  the  letters  now,  and       1.    Therefore  : 


+  00 


—  00 


*  Pearson:  Phil.  Trans.  A,  Vol.  195,  p.  3 


—  X  -7=   (xxxiv).
Now*:  ^«e-*^'=  -^-{v^_^e~^-'),  and  ^'«-i  =  ^^.

f+*'         _i  2     r+'"    _  2 

Hence:  -       jxd{v^_,e  2-*)=  2         ^^e  ^  v„_iC?a? 

J    -00  J  -CC 

2f<"  _2  2   f'^'^  _a2  _2 

n  J  -(x,  "■J-Qo 

2    r  +  -  _3^2     ,  4    f  +  =^    .  , 

Butf  :  xVn  =  Vn+^  +  ^^  ,  thus: 

^n=--j_^^ne       dx-\-^^   j,e  ^  v.^^^dx  +  -j  ^^j,e  ^dx 
2  f+°"       _a-.2  ,  2 

Or:  9'n+3+ 2''^9'«='  ^'^  (i£c  =  y8„  say,   (xxxv). 

j  -00 

It  now  remains  to  find  /S„. 

J    -co  J  —  00 


J    —  CC  J    —  CO 


n-l  f+ 


+  ^  a. ,.2 


3 

=  -f  (n-l)  I  v^_,e-'^'^^'dx=-^{n-l)^„ 

J  -CD 

=  (  — f     (n— 1)  (w  — 3)  (n  —  5)  1  x  y8„ ,  n  being  of  course  even. 

But  ^^-^        v,e-"^'''dx  =  J^. 

J  -co 

Thus  we  have  the  reduction  formula : 

9n-,.  +  ^nq^  =  {-  f     {n-l){n-3){n-5)  1  x   (xxxvi). 

* loc. cit. p. 5.    † loc. cit. p. 4.
We can now rapidly calculate the q's.

2tt  77  ^       /'Ztt  5  /27r 

_14    /2^  _2552  /2^ 

3  V  3  '  9  V  3  '  2rVT' 

This is probably more than sufficient for most practical purposes. Evaluating the coefficients numerically we have from (xxxiii):

$$\frac{p_{22}}{p_{20}^2} = 1 + 0.6079271r^2 + 0.1407239r^4 + 0.0367758r^6 + 0.0102587r^8 + 0.0029933r^{10} \qquad \text{(xxxvii)}.$$

To test the accuracy of this result, obviously correct for $r = 0$, consider $r = 1$. We have:

$$(p_{22}/p_{20}^2)_{r=1} = 1.7986788 \qquad \text{(xxxviii)}.$$

But in this case the variate correlation surface becomes a ridge and $g_1 = g_2$, or:

$$(p_{22})_{r=1} = \frac{1}{N}\int_0^N \left(g_1 - \tfrac12 N\right)^4 dg_1 = \frac{N^4}{80}, \qquad (p_{22}/p_{20}^2)_{r=1} = \frac{144}{80} = 1.8 \qquad \text{(xxxix)}.$$

The difference between (xxxviii) and (xxxix) is only 0.0013212 or about 0.07 per cent. Thus even if we omit the term in $r^{10}$, we shall be less than 0.2 per cent in error in this extreme case, when the probable error itself is zero; and for lesser values of $r$, where the probable error is sensible, we shall not be as much as 0.01 per cent in error. This is amply sufficient for statistical purposes. I now take $p_{31}$ and find its value in a different manner.

i?3i=7v7  i'izdxdy, 

-'■'J  -  00  J  -co 

This  can  be  integrated  twice  by  parts,  and  the  part  between  limits  vanishes  at 
each  integration.    Writing  x  =  o-, a?',  y  —  a^y'  as  before,  we  have: 

dp,,     3N'  r  +  < 


dr 

This  is  found  at  once  from  qn^^  ]      3x>'~'^'' '^n-\dx,  or  q..  =  -  I      ixde'"" ,  since  v.^_y^^x.  Thus 

J  -OO  i  -00
The integration with regard to $y'$ can now be completed and we find:

dp,,      3N'  1 


4  — r2 

+  00      _  1  2  


where  jx=\  ^ 

0 


e  ^2-.^    j^^dx   (xl), 


and  we  have  dropped  the  dash  on  x  as  no  longer  of  service. 
Write  m  =  (4  —  J"'^)/('2  —  r),  and  we  must  now  find  : 


"^"^  12 


I=j      e-^'^'-'^j^dx   (xli). 

Now  : 

di       r+°°     1  2        1  (+' 


dm 


1  \[ye--'^'^'j:dx  =  ^  |^%j-^(e-*--^) 


or: 


1 

C  +00 

2m 

j  -oo 

I 

2m 

+  - 

m 

I 

—  i  7M  1'2 


(im     2m  (m  4- 1 )  j  -  00  m  (m  +  1 )  y^Tj,  +  2  ' 


thus:  ^Uml)=  

Thus:  v/m/=  constant -V2^  cos"'  —,-r   (xhi). 

m  +  1 

To  evaluate  the  constant*  put  m=  1,  and  we  have  : 
constant  =       +  j2Tr  cos"' 


3 


Or  finally  :  /  = 


+  s/7r/2 

n/2^  Ttt  1 


I     ,_  A^^^y.  +  ^/27^-=7rV^/2• 


V^m  \2  m  +  1 

V2^  .  _i  1 

sm   


.(xliii). 


Jm  m+l 

*  Mr  L.  F.  Richardson  has  shown  me  that  if  we  put  m  =  0,  since  the  inverse  cosine  now  vanishes 
 r+oo     1  _, 

constant  =  Limit  <t  —  co  of  v27r  I      -         e   ^  a'^  jx'^^i  which  he  has  evaluated  with  the  same  result. 
Returning now to (xl) we can replace $m$ by its value in terms of $r$ and write

$$\frac{d}{dr}\left(\frac{p_{31}}{p_{20}^2}\right) = \frac{108}{\pi^2}\,\frac{1}{\sqrt{4 - r^2}}\,\sin^{-1}\frac{2 - r^2}{2(3 - r^2)} \qquad \text{(xliv)}.$$

This expression I have not succeeded in integrating. I have therefore expanded it in $r$ and then integrated. Since $p_{31} = 0$ for $r = 0$, we see the constant is zero after integration; thus after some troublesome expansions I find:

^,  =  ^{-839,8369r--005,4820r'--003,6798r'- -001, 18364  (^Iv). 

The value of $p_{13}$ is clearly the same as $p_{31}$ for nothing would be altered if $x$ and $y$ were interchanged from (xl) onwards. To test the accuracy of the result, suppose $r = 1$.    Then we have from the 'ridge', since $g_1 = g_2$:

$$(p_{31})_{r=1} = \frac{1}{N}\int_0^N \left(g_1 - \tfrac12 N\right)^4 dg_1 = \frac{N^4}{80},$$

or: $(p_{31}/p_{20}^2)_{r=1} = 1.8$, again.

But (xlv) gives us:

$$(p_{31}/p_{20}^2)_{r=1} = 1.8 \times 1.00153,$$

that is a result at a maximum only 0.15 per cent in error and correct enough for all statistical purposes. The next step is to determine the powers of $r$ in terms of $\rho$ and substitute in the expressions just found for $p_{22}/p_{20}^2$ and $p_{31}/p_{20}^2$.    I find:

$$\frac{p_{22}}{p_{20}^2} = 1 + 0.6666667\rho^2 + 0.108308\rho^4 + 0.0197955\rho^6 + 0.0027683\rho^8 \qquad \text{(xlvi)},$$

and:

$$\frac{p_{31}}{p_{20}^2} = 1.9471220\rho - 0.1234135\rho^3 - 0.0194138\rho^5 - 0.0038120\rho^7 \qquad \text{(xlvii)}.$$

To verify we note that for $\rho = 1$, these give 1.7975 and 1.8005 instead of 1.8, quite sufficiently close for the purpose in view.

We now substitute in (xxxi) and find as far as $\rho^8$ that:

$$\Sigma_\rho^2 = \frac{1}{N}\{1 - 1.8275773\rho^2 + 0.6884687\rho^4 + 0.1127773\rho^6 + 0.0202900\rho^8\} \qquad \text{(xlviii)}.$$

I throw this, by dividing by $(1 - \rho^2)^2$, into the form:

$$\Sigma_\rho = \frac{1 - \rho^2}{\sqrt{N}}\{1 + 0.0862113\rho^2 + 0.0129408\rho^4 + 0.0023757\rho^6 + 0.0001822\rho^8\},$$

or, dropping unnecessary decimals:

$$\Sigma_\rho = \frac{1 - \rho^2}{\sqrt{N}}\{1 + 0.086\rho^2 + 0.013\rho^4 + 0.002\rho^6\} \qquad \text{(xlix)}.$$

Thus we see that the distribution of grades being very far from normal, the probable error $0.67449\,\Sigma_\rho$ of the correlation of grades exceeds the value $0.67449(1 - \rho^2)/\sqrt{N}$, which it would take on the hypothesis of normal correlation by a factor which can amount to about 10 per cent at a maximum, but gives 0 per cent excess when $\rho = 0$, then agreeing with our previous result.

I propose now to find the probable error in r as determined by grade methods in terms of r. This involves expressing ρ and $\rho^{2}$ in terms of r; these are easily found from the known expansions for $\sin^{-1}x$ and $(\sin^{-1}x)^{2}$. We have:

$$1 + \tfrac{1}{2}\rho^{2} = 1 + .4559453r^{2} + .0379954r^{4} + .0050661r^{6} + .0008142r^{8},$$

$$2\rho = 1.9098593r + .0795775r^{3} + .0092650r^{5} + .0013322r^{7}.$$
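The two expansions can be checked numerically against the closed form $\rho = (6/\pi)\sin^{-1}(r/2)$, which follows from the relation $r = 2\sin(\pi\rho/6)$ used in this section. The following is a minimal Python sketch; the function names are illustrative and not part of the paper, and the coefficients are copied as printed above.

```python
import math

def rho_exact(r):
    """Grade correlation rho obtained by inverting r = 2*sin(pi*rho/6)."""
    return (6.0 / math.pi) * math.asin(r / 2.0)

def one_plus_half_rho_sq_series(r):
    """Printed series for 1 + rho^2/2, carried as far as r^8."""
    return (1.0 + 0.4559453 * r**2 + 0.0379954 * r**4
            + 0.0050661 * r**6 + 0.0008142 * r**8)

def two_rho_series(r):
    """Printed series for 2*rho, carried as far as r^7."""
    return (1.9098593 * r + 0.0795775 * r**3
            + 0.0092650 * r**5 + 0.0013322 * r**7)

if __name__ == "__main__":
    for r in (0.2, 0.5, 0.8):
        rho = rho_exact(r)
        print(r, 1.0 + 0.5 * rho**2, one_plus_half_rho_sq_series(r),
              2.0 * rho, two_rho_series(r))
```

For moderate r the truncated series and the closed form agree to four decimals or better, which is ample for the probable-error work that follows.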

These  must  be  used  in  (xxxi),  which  may  be  written  in  the  form  : 


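Hence using (xxxviii) and (xlv), we deduce after some troublesome multiplications:

$$\Sigma_\rho^{2} = \frac{1}{N}\left\{1 - 1.6665507r^{2} + .4336130r^{4} + .1618337r^{6} + .0495042r^{8}\right\}$$

$$= \frac{1}{N}(1-r^{2})^{2}\left\{1 + .3334493r^{2} + .1005116r^{4} + .0294076r^{6} + .0078078r^{8}\right\}.$$

But since: $r = 2\sin\frac{\pi}{6}\rho$,

$$\delta r = \frac{\pi}{3}\cos\frac{\pi}{6}\rho \times \delta\rho \quad\text{and}\quad \Sigma_r = \frac{\pi}{3}\sqrt{1-\tfrac{1}{4}r^{2}}\;\Sigma_\rho.$$

Thus:

$$\Sigma_r^{2} = \frac{\pi^{2}}{9}\,\frac{(1-r^{2})^{2}}{N}\left\{1 + .0834493r^{2} + .0171493r^{4} + .0042797r^{6} + .0004559r^{8}\right\}.$$

Taking the square root we have:

$$\Sigma_r = 1.0472\,\frac{1-r^{2}}{\sqrt{N}}\left\{1 + .0417246r^{2} + .0077042r^{4} + .0018184r^{6} + .0001224r^{8}\right\},$$

or, for all practical purposes, the probable error of r found from the grade correlation is,

$$\text{P.E. of } r = .70633\,\frac{1-r^{2}}{\sqrt{N}}\left\{1 + .042r^{2} + .008r^{4} + .002r^{6}\right\} \qquad \text{(l)}.$$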

Clearly for all values of r, this is larger than the probable error of the correlation r found by the product moment method, i.e. $.67449\,(1-r^{2})/\sqrt{N}$. The maximum difference, as r approaches unity is 10 per cent. The value can always be found from (l) without any trouble. The completer value is singularly close to

•70633    1  - 
but no advantage is gained in calculation by using this form, as tables of powers of r up to the 6th exist*.
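As a numerical illustration of equation (l), the sketch below evaluates both probable errors in Python. The function names are illustrative, not the paper's; the constants .67449 and .70633 and the bracketed series are those printed above.

```python
import math

def pe_product_moment(r, n):
    """Probable error of r by the product-moment method: .67449 (1 - r^2) / sqrt(N)."""
    return 0.67449 * (1.0 - r**2) / math.sqrt(n)

def pe_from_grades(r, n):
    """Probable error of r deduced from the grade correlation, equation (l)."""
    series = 1.0 + 0.042 * r**2 + 0.008 * r**4 + 0.002 * r**6
    return 0.70633 * (1.0 - r**2) / math.sqrt(n) * series

if __name__ == "__main__":
    n = 100
    for r in (0.0, 0.3, 0.6, 0.9):
        ratio = pe_from_grades(r, n) / pe_product_moment(r, n)
        print(f"r = {r:.1f}  ratio of probable errors = {ratio:.4f}")
```

The ratio of the two probable errors does not depend on N. It starts a little under 1.05 at r = 0 and rises toward roughly 1.10 as r approaches unity, in line with the 10 per cent. maximum stated above.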

We  see  therefore  from  this  section  that  whatever  be  the  value  of  r,  then  for 
normal  frequency  the  probable  error  of  r  found  by  the  product  moment  method  is  less 
than  the  value  found  by  the  correlation  of  grades.  Further  there  is  no  reason  for 
supposing that the probable error of r found from the difference of grades (R) is not
greater  than  the  probable  error  of  r  found  from  the  product  moment  of  grades. 

We  accordingly  conclude  that  the  new  methods  are  less  accurate  than  the  old. 
But  they  possess  some  advantages, — when  ranks  can  be  easily  determined, — in 
rapidity  of  calculating,  and  there  are  undoubtedly  cases  where  they  can  be  used 
effectively.  In  saying  this  I  must  reassert  that  I  do  not  believe  there  is  any  advantage 
in  the  knowledge  of  rank  correlation  in  itself ;  I  look  upon  it  as  a  mere  stage  to  the 
discovery  of  the  variate  correlation.  For  the  comparability  of  rank  correlations 
depends  upon  the  sameness  of  type  in  the  frequency  distributions,  and  this  assumption 
is  the  weak  step  in  the  method.  Granted  approximately  normal  distributions,  then 
the  variate  correlation  flows  from  the  rank  correlation,  and  the  whole  investigation 
gains  a  rich  significance. 

My  remaining  sections  will  be  devoted  to  illustration  of  the  new  methods  and 
their  comparison  with  the  old. 

(9) Illustration III. Correlation of National Debt and Population.
The following table is based on data for the year 1900, and raises no pretence to exactness, or financial accuracy. It is merely illustrative.


Table IV. Population and Indebtedness of Various States 1900.

State             Population    Debt in     Population   Debt    v1 − v2   (v1 − v2)²
                  in millions   million £   Rank         Rank
Russia              129.20       1097.0        1           2       −1          1
United States        76.40        200.0        2           8       −6         36
German Empire†       56.34        649.4        3           4       −1          1
Austria              47.01        226.7        4           7       −3          9
Japan                43.80         51.5        5          15      −10        100
United Kingdom       41.60        705.0        6           3       +3          9
France               38.64       1242.1        7           1       +6         36
Italy                32.10        500.0        8           5       +3          9
Turkey               20.30        162.0        9           9        0          0
Spain                18.10        385.0       10           6       +4         16
Belgium               6.82        106.4       11          11        0          0
Roumania              5.50         58.0       12          14       −2          4
Sweden                5.14         18.6       13          17       −4         16
Holland               5.10         95.6       14          12       +2          4
Portugal              4.70        155.0       15          10       +5         25
Argentine             4.50         86.4       16          13       +3          9
Switzerland           3.30          3.6       17          20       −3          9
Greece                2.40         28.0       18          16       +2          4
Norway                2.20         12.7       19          18       +1          1
Denmark               2.18         11.6       20          19       +1          1

                                        30 = S(v1 − v2)       290 = S(v1 − v2)²

* See Biometrika, Vol. ii. p. 474.
† Imperial debt and sum of state debts.
Hence: $N^{2} - 1 = 399$, and $\rho_{12} = 1 - 6 \times 290/(20 \times 399) = .7820$.

Further: $R = 1 - 6 \times 30/399 = .5489$.

These values are obtained in a few minutes, if the ranks have once been written down. If $\rho_{12}$ only be required, we need not write down the $v_1 - v_2$ column at all, the squares being placed down straight away from the rank columns.

Now applying equations (xxiv) we determine:

r = .7962, found from $\rho_{12}$,
  = .7810, found from R.

The probable error of r found from $\rho_{12}$ as given by Equation (l) is .0596. Thus we conclude that

r = .80 ± .06, found from $\rho_{12}$,
  = .78 ± > .063, found from R*.
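The arithmetic of this illustration can be retraced in a few lines of Python. This is a sketch only: equations (xxiv) are not reproduced in this section, so the two conversions used here, $r = 2\sin(\pi\rho/6)$ and $r = 2\cos(\pi(1-R)/3) - 1$, are assumed forms; they do reproduce the printed values. All function names are illustrative.

```python
import math

# Ranks read from Table IV, in the order of the table (Russia ... Denmark).
POP_RANK = list(range(1, 21))
DEBT_RANK = [2, 8, 4, 7, 15, 3, 1, 5, 9, 6,
             11, 14, 17, 12, 10, 13, 20, 16, 18, 19]

def rank_correlation(v1, v2):
    """rho = 1 - 6 S(v1 - v2)^2 / (N (N^2 - 1)), from squared rank differences."""
    n = len(v1)
    s_sq = sum((a - b) ** 2 for a, b in zip(v1, v2))
    return 1.0 - 6.0 * s_sq / (n * (n * n - 1))

def footrule_R(v1, v2):
    """R = 1 - 6 S(v1 - v2) / (N^2 - 1), summing the positive differences only."""
    n = len(v1)
    s_pos = sum(a - b for a, b in zip(v1, v2) if a > b)
    return 1.0 - 6.0 * s_pos / (n * n - 1)

def r_from_rho(rho):
    """Assumed form of (xxiv): variate correlation from rank correlation."""
    return 2.0 * math.sin(math.pi * rho / 6.0)

def r_from_R(R):
    """Assumed form of (xxiv): variate correlation from R."""
    return 2.0 * math.cos(math.pi * (1.0 - R) / 3.0) - 1.0

if __name__ == "__main__":
    rho = rank_correlation(POP_RANK, DEBT_RANK)
    R = footrule_R(POP_RANK, DEBT_RANK)
    print(round(rho, 4), round(R, 4))
    print(round(r_from_rho(rho), 4), round(r_from_R(R), 4))
```

Only the two rank columns are needed as input, which is the source of the rapidity claimed for the method.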

If we turn to the much more laborious method of moments, we find:

Mean Population = 27.26 millions;   Mean Debt = 289.7 million £,

S.D. Population = 31.74 millions;   S.D. Debt = 357.9 million £.

Now these results in themselves should be sufficient to warn us that both distributions are very far from normal; for the S.D.'s in both cases are greater than the means, and since in a normal distribution we might easily have a deviation equal to the S.D., we should on that hypothesis expect to get negative debts and negative populations. The distributions are therefore very skew, or in clubbing together great and small powers, we have introduced excessive heterogeneity, completely destroying any approach to normality†. If we work out the value of r by the product moment method, we find:

r = .68 ± .08.
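The moments quoted above can be reproduced from the population and debt columns of Table IV. The following Python sketch assumes standard deviations taken about the mean with divisor N, which is what reproduces the printed figures; the names are illustrative.

```python
import math

# Population (millions) and debt (million pounds) from Table IV, in table order.
POPULATION = [129.20, 76.40, 56.34, 47.01, 43.80, 41.60, 38.64, 32.10, 20.30, 18.10,
              6.82, 5.50, 5.14, 5.10, 4.70, 4.50, 3.30, 2.40, 2.20, 2.18]
DEBT = [1097.0, 200.0, 649.4, 226.7, 51.5, 705.0, 1242.1, 500.0, 162.0, 385.0,
        106.4, 58.0, 18.6, 95.6, 155.0, 86.4, 3.6, 28.0, 12.7, 11.6]

def mean_sd(values):
    """Mean and standard deviation (divisor N) of a list of numbers."""
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)

def product_moment_r(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, sx = mean_sd(x)
    my, sy = mean_sd(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return cov / (sx * sy)

def pe_product_moment(r, n):
    """Probable error .67449 (1 - r^2) / sqrt(N)."""
    return 0.67449 * (1.0 - r**2) / math.sqrt(n)

if __name__ == "__main__":
    r = product_moment_r(POPULATION, DEBT)
    print(mean_sd(POPULATION), mean_sd(DEBT))
    print(round(r, 2), round(pe_product_moment(r, len(DEBT)), 2))
```

That each standard deviation exceeds its mean is visible at once from the printed output, which is the warning sign of non-normality drawn on in the text.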

We see at once that the rank method has so exaggerated the correlation that it has made the probable error of the less exact methods less than the probable error of the more exact method! The explanation of this lies simply in the fact that the system we are dealing with is not normal. If the ranks of two variables were those given in Table IV, and the distribution were normal, then the variate correlation would be .80; it actually takes the value .68, and this is a very good illustration of how much the nature of the distribution may affect a judgment from ranks.

* The p.e. is of the form $\frac{.7738}{\sqrt{N}}(1 - r^{2} + c_1r^{2} + c_2r^{4} + c_3r^{6})$, the c's being positive unknown constants, and this is > .063.

† If we confine our attention to the seven "great powers," Austria, France, Germany, Great Britain, Italy, Russia and the United States, we find $\rho_{12} = -.143$, $R = -.125$, giving r = −.15 and −.23 with a probable error of .3; this result again emphasises the heterogeneity of the material.
Of course it is doubtful whether when we are in ignorance of the character of the distribution we could say more than

r = .8 ± .1, found from $\rho_{12}$,

and r = .7 ± .1, found by product-moment.

These might then be treated as identical for some purposes of inference. But the advantage of the longer product-moment method would be that it would have taught us that the correlation was non-Gaussian, and given us in the process the regression line. This would probably more than compensate for its greater laboriousness.

(10) Illustration IV. Correlation between mean Size of Litter in a Generation and mean Sex Ratio in the same Generation in the case of Mice.

The following data are taken from a paper in Biometrika, Vol. v., p. 439.


Table V.

Generation   Mean size    Mean Sex   Litter   Sex Ratio   v1 − v2   (v1 − v2)²
             of Litter    Ratio      Rank     Rank
1st            5.06        .505        5         3          +2          4
2nd            4.94        .491        6         4          +2          4
3rd            5.96        .523        1         2          −1          1
4th            5.93        .542        2         1          +1          1
5th            5.53        .462        3         6          −3          9
6th            5.23        .483        4         5          −1          1

Thus: $S(v_1 - v_2)^{2} = 20$, $S(v_1 - v_2) = 5$,

and $\rho_{12} = .429$, $R = .143$.

Whence: r from $\rho_{12}$ = .45 ± .23,

r from R = .25 ± > .23.

The actual value of r from product-moment is

r = .63 ± .17.

This example serves to show that the correlation found from R may, when the observations are few, not be definitely significant, while when we proceed in the more accurate manner it is definitely significant. The R-method is thus shown not to have special advantages, but rather peculiar disadvantages for short series. Its merit really lies in rapidity of working for assay purposes and rough treatment.

(11)    Illustration  V.    Resemblance  of  Cousins. 

(a) Width of Hand. The following table gives the width of the hand in 34 pairs of male adult cousins taken from my series of Cousin Measurements. These data are being used by Miss Ethel M. Elderton in a forthcoming paper on this subject, and I have most heartily to thank her for the exhaustive manner in which she has dealt with the material in order to illustrate the whole subject of determining correlation by ranks.


Table VI. Width of Hand in mm. in Pairs of Male Adult Cousins.

1st cousin x   2nd cousin y   Rank A    Rank B    v1 − v2   (v1 − v2)²   True Grade of A   True Grade of B
   80.7           80.0        23, 17    17, 23       6          36       23.51, 20.74      20.74, 23.51
   90.0           80.0        58, 17    17, 58      41        1681       58.82, 20.74      20.74, 58.82
   80.7           84.7        23, 48    48, 23     −25         625       23.51, 40.66      40.66, 23.51
   90.0           84.7        58, 48    48, 58      10         100       58.82, 40.66      40.66, 58.82
   80.0           84.7        17, 48    48, 17     −31         961       20.74, 40.66      40.66, 20.74
   74.5           81.0         3, 26    26,  3     −23         529        5.52, 24.74      24.74,  5.52
   81.0           80.0        26, 17    17, 26       9          81       24.74, 20.74      20.74, 24.74
   86.0           81.0        52, 26    26, 52      26         676       46.00, 24.74      24.74, 46.00
   80.7           83.7        23, 43    43, 23     −20         400       23.51, 36.36      36.36, 23.51
   94.0           82.7        64, 37    37, 64      27         729       65.26, 31.99      31.99, 65.26
   94.0           81.7        64, 34    34, 64      30         900       65.26, 27.68      27.68, 65.26
   76.0           77.0         5, 11    11,  5      −6          36        8.44, 10.90      10.90,  8.44
   76.0           79.0         5, 16    16,  5     −11         121        8.44, 17.08      17.08,  8.44
   76.0           83.0         5, 41    41,  5     −36        1296        8.44, 33.29      33.29,  8.44
   86.3           88.3        53, 54    54, 53      −1           1       47.16, 54.16      54.16, 47.16
   92.5           85.0        60, 51    51, 60       9          81       63.51, 41.94      41.94, 63.51
   83.7           81.7        43, 34    34, 43       9          81       36.36, 27.68      27.68, 36.36
   83.7           83.3        43, 42    42, 43       1           1       36.36, 34.62      34.62, 36.36
   83.7           78.7        43, 15    15, 43      28         784       36.36, 16.05      16.05, 36.36
   82.0           81.0        36, 26    26, 36      10         100       28.95, 24.74      24.74, 28.95
   80.5           80.0        22, 17    17, 22       5          25       22.71, 20.74      20.74, 22.71
   75.0           76.0         4,  5     5,  4      −1           1        6.40,  8.44       8.44,  6.40
   71.0           76.0         1,  5     5,  1      −4          16        1.70,  8.44       8.44,  1.70
   73.0           77.0         2, 11    11,  2      −9          81        3.45,  8.44       8.44,  3.45
   84.5           78.0        47, 13    13, 47      34        1156       39.81, 13.78      13.78, 39.81
   76.0           78.0         5, 13    13,  5      −8          64        8.44, 13.78      13.78,  8.44
   93.3           89.7        61, 55    55, 61       6          36       64.53, 58.09      58.09, 64.53
   93.3           82.7        61, 37    37, 61      24         576       64.53, 31.99      31.99, 64.53
   98.7           82.7        66, 37    37, 66      29         841       67.58, 31.99      31.99, 67.58
   89.7           81.0        55, 26    26, 55      29         841       58.09, 24.74      24.74, 58.09
   81.0           93.3        26, 61    61, 26     −35        1225       24.74, 64.53      64.53, 24.74
   82.7           81.0        37, 26    26, 37      11         121       31.99, 24.74      24.74, 31.99
   98.7           81.0        66, 26    26, 66      40        1600       67.58, 24.74      24.74, 67.58
   98.7           89.7        66, 55    55, 66      11         121       67.58, 58.09      58.09, 67.58

Mean Size = 83.16     Mean Rank = 34     Sum of column 5 = 605     Sum of column 6 = 2 × 15923     Mean Grade = 32.41

The measurements were only read to the millimetre, but since measurements were taken two or three times in each case the fractions .3, .5 or .7 arise, when averaging. Since either cousin may be the "first" cousin, we have for a symmetrical table 68 pairs. In the third and fourth columns, we have the ranks placed, according as to which cousin is considered the "first." It will at once be obvious that many ties arise; thus no less than eight individuals tie with a width of hand 81 mm. at rank 26. It is not so clear what rank ought to be given to them. They run from 26 to 33; we may call them all 29.5. We shall speak of this as the mid-rank method. Or, we might put them all at 26, because this would probably be the result nearest to the true grade*. We shall speak of this as the bracket-rank method†.
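The two conventions for ties can be made concrete with a short Python sketch. The function name is illustrative; ranks are counted from the smallest value upward, as in Table VI, where the hand of width 71.0 mm. has rank 1.

```python
from collections import Counter

def tie_ranks(values):
    """Return (bracket_ranks, mid_ranks) for a list of measurements.

    Bracket-rank: every member of a tie takes the first place the tie occupies.
    Mid-rank:     every member of a tie takes the mean of the places it occupies.
    """
    counts = Counter(values)
    bracket, mid = {}, {}
    below = 0  # number of individuals with a strictly smaller value
    for v in sorted(counts):
        k = counts[v]
        bracket[v] = below + 1
        mid[v] = below + (k + 1) / 2
        below += k
    return [bracket[v] for v in values], [mid[v] for v in values]

if __name__ == "__main__":
    # Hypothetical toy series: 25 distinct smaller values, then eight ties.
    widths = list(range(25)) + [81] * 8
    bracket, mid = tie_ranks(widths)
    print(bracket[-1], mid[-1])
```

With 25 individuals below a tie of eight, the tied group occupies places 26 to 33, so the bracket-rank is 26 and the mid-rank 29.5, as in the case of the hands of width 81 mm. described above.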

The above table illustrates the work for the bracket-rank method in columns 5 and 6, the differences of ranks A and B being, however, only written down once, so that to find $S(v_1 - v_2)$ we must sum all quantities in the fifth column as if they had the same sign, and to find $S(v_1 - v_2)^{2}$ we must double the sum of the squares in the sixth column.

We find: R = .2148 and $\rho_{12}$ = .3922,

whence r from $\rho_{12}$ = .408 ± .072,

r from R = .361 ± > .072.

If we now investigate the value of R and $\rho_{12}$ from the mid-ranks, we find that $S(v_1 - v_2) = 588$ and $S(v_1 - v_2)^{2} = 29812$. Accordingly:

R = .2369, and $\rho_{12}$ = .4310.

Whence: r from $\rho_{12}$ = .448 ± .069,

r from R = .396 ± > .069.

Both  these  values  for  r  are  higher  than  those  determined  by  the  bracket-rank 
process.  We  must  then  question  whether  the  mid-rank  or  the  bracket-rank  method 
is  the  better.  Or,  indeed  is  it  not  possible,  that  sometimes  the  one,  and  sometimes 
the  other  will  be  the  closer  according  to  the  nature  of  the  frequency  distribution  ? 

To illustrate this point the actual grades on the basis of normal distribution have been calculated by Eqn. (xii). It must be remembered that .5 has to be added to the grade to obtain the rank, Eqn. (xiii).

We find: Mean width of hand = 83.16 mm.

Standard Deviation = 6.201 mm.

As illustration of the method consider the hand of width 84.7 mm.; its deviation is 1.54 and the ratio of this to the S.D. = .248; this corresponds to a value of ½(1 + α), in the notation of Sheppard's Tables, = .59793, and multiplied by 68 gives the grade 40.66, corresponding to a rank 41.16, as against the observed rank 48 or a mid-rank 49! Thus the actual size of organ corresponding to a bracket rank may differ widely from the size really belonging to the ranked organ, or the true grade in a general population differ very considerably from the spurious grade or rank in the sample used. This point again indicates how little can be judged from ranks unless we associate the rank distribution with some frequency hypothesis.
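The worked example for the hand of width 84.7 mm. can be followed in Python. Eqn. (xii) is not reproduced in this section, so the sketch assumes it amounts to grade = N × ½(1 + α), with ½(1 + α) the normal probability integral up to the standardised deviation; the function names are illustrative, and the mean and standard deviation are those given in the text.

```python
import math

MEAN_WIDTH = 83.16    # mm, as given in the text
SD_WIDTH = 6.201      # mm, as given in the text
N_INDIVIDUALS = 68    # individuals in the symmetrical table

def normal_cdf(z):
    """Normal probability integral from minus infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def true_grade(width_mm, mean=MEAN_WIDTH, sd=SD_WIDTH, n=N_INDIVIDUALS):
    """Grade on the normal hypothesis: number of individuals expected below width_mm."""
    return n * normal_cdf((width_mm - mean) / sd)

def rank_from_grade(grade):
    """Eqn. (xiii) as described in the text: add .5 to the grade to obtain the rank."""
    return grade + 0.5

if __name__ == "__main__":
    g = true_grade(84.7)
    print(round(g, 2), round(rank_from_grade(g), 2))
```

Small differences in the second decimal from the tabulated grades are to be expected, since the text rounds the standardised deviation to three places before entering Sheppard's Tables.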

* That is, find $\sigma_g$ and calculate $g_1$ and $g_2$ from Eqn. (xii) p. 10; the true grade in this case is 24.74, and $v_1 = g_1 + .5 = 25.24$ is even below 26, not above it.

† To adopt a term from the examination world, where the place number of the bracket is measured only by those above.

Having found the true grades we may correlate them together to find $\rho_{12}$, but in using the formula

$$\rho_{12} = 1 - \frac{S(g_1 - g_2)^{2}}{2N\sigma_g^{2}}$$

we may adopt either the theoretical value $\tfrac{1}{12}N^{2}$ for $\sigma_g^{2}$, or we can actually calculate its value. Now $\tfrac{1}{12}N^{2} = 385\tfrac{1}{3}$ and $\sigma_g^{2} = 365.94$, and thus there is a very considerable deviation from normality in the series*. $S(g_1 - g_2)^{2} = 31153.195$, and thus:

$\rho_{12}$ found from the true $\sigma_g^{2}$ = .3740,

$\rho_{12}$ found from $\sigma_g^{2} = \tfrac{1}{12}N^{2}$ = .4055.

Whence: r from true $\sigma_g^{2}$ = .3890,

r from $\sigma_g^{2} = \tfrac{1}{12}N^{2}$ = .4215.
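The effect of the two choices of $\sigma_g^{2}$ can be checked with a few lines of Python. This is a sketch under the assumption that the formula in question is $\rho_{12} = 1 - S(g_1 - g_2)^{2}/(2N\sigma_g^{2})$, which is the form that reproduces the values printed above; the function names are illustrative.

```python
import math

def rho_from_grade_differences(sum_sq_diff, n, sigma_g_sq):
    """rho = 1 - S(g1 - g2)^2 / (2 N sigma_g^2)."""
    return 1.0 - sum_sq_diff / (2.0 * n * sigma_g_sq)

def r_from_rho(rho):
    """Variate correlation from grade correlation, r = 2 sin(pi * rho / 6)."""
    return 2.0 * math.sin(math.pi * rho / 6.0)

N = 68
S_SQ = 31153.195                 # S(g1 - g2)^2 as quoted in the text
SIGMA_G_SQ_OBSERVED = 365.94     # sigma_g^2 calculated from the grades
SIGMA_G_SQ_THEORY = N**2 / 12.0  # theoretical value N^2/12

if __name__ == "__main__":
    for label, s2 in (("true", SIGMA_G_SQ_OBSERVED), ("N^2/12", SIGMA_G_SQ_THEORY)):
        rho = rho_from_grade_differences(S_SQ, N, s2)
        print(label, round(rho, 4), round(r_from_rho(rho), 4))
```

A difference of about five per cent. in $\sigma_g^{2}$ moves the grade correlation by about .03, which is of the same order as the difference between the bracket-rank and mid-rank results.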

If we might judge from this single case we should conclude that the bracket-rank method gave a closer result to the grade method than the mid-rank method. But the question now arises, how close after all are all these grade rank methods to the correlation coefficient in any short series such as the present?

Accordingly the series was worked out by product moment and the result obtained was

r = .331 ± .073.

Thus we see that the actual correlation is considerably lower than that given by any of the rank or grade processes. It is perfectly true that .33 and .45 are within double the probable error, and therefore two different random samples of the real population might have given as widely divergent results. But this is really the case of two different methods applied to the same sample. And further the actual correlation tells us that as far as this sample is concerned the true answer is likely to lie between .19 and .48, but the mid-rank method tells us that it is likely to lie between .31 and .58†. Now it is clear we might for some extraneous reason hold the value likely to be .56, and we should find nothing to contradict this in the mid-rank result. But the proper method of determining r would show us that such a value was itself very unlikely. Thus the latter method when it diverges less than twice the probable error from the result of the rank method may yet forbid us to interpret the results in a manner admissible on the rank method. We cannot argue in like manner from the grade or rank result because that method has assumed an hypothesis, not made in the product-moment treatment, i.e. that of normal correlation, which is here not justified by the results.

But even the amount of agreement here noted is to be considered rather exceptional. I owe to Miss Elderton the working out of three other pairs of characters in the same set of male cousins each in five different ways. I have myself done each of them in three more ways, namely by Variate Differences as in Art. 2, and by the R method. The results are given in the Table below.

* The mean grade in fact = 32.41 and not 34 also.

† Taking a range of twice the probable error on either side the means.
Table VII. Comparison of Correlation Coefficients found by Various Methods.
Resemblance of Hand in 68 Pairs of Male Cousins.

                                                            Grades                         Ranks
                                                                             Bracket-Rank           Mid-Rank
Character                  Product       Variate       True     σg² =      By ρ12       By R      By ρ12       By R
                           Moment        Difference    σg²      N²/12
Width of Hand              .33 ± .07       .37         .39       .42      .41 ± .07     .36      .45 ± .07     .40
Width of Wrist             .17 ± .08       .25         .12       .22      .07 ± .085    .05      .08 ± .085    .03
Length of Index Finger     .19 ± .08       .14         .10       .13      .21 ± .08     .29      .19 ± .085    .29
Length of Little Finger    .29 ± .075      .26         .18       .30      .20 ± .08     .19      .24 ± .08     .21

Mean of Four Results       .25             .25         .21       .25      .22           .22      .24           .23

Root Mean Square
Deviation from true r                      .053        .069      .060     .079          .094     .079          .096

It will, I think, be clear from this table that for series even with as many as 68 pairs (and this is approaching the limit at which any time is gained by using rank methods) we cannot hope to ascertain the correlation of the sample by such methods within about .1 of its value, and as the probable error of the sample may be .07, we may well deviate .2 from the population value in our estimate. We are accordingly very unlikely to reach reliable results by rank methods for the 8 to 10 observations to which Dr Spearman proposes to apply his R-method. We see that the mean values are fairly close, although the variate difference and the second grade methods give the best results. Judged by mean square deviations from product moment results, the variate difference is easily first, then come the laborious grade methods, the rank methods by about fifty per cent. worse than the variate difference, and lastly the R methods not quite 100 per cent. worse. Thus we note that when a series is not fairly long and not approximately normal, the different rank and grade methods will give very diverse results. But when a series is fairly long, say 100 or more observations, then there is no advantage in rapidity from the rank method; the formation of a grouped correlation table, and the use of the product moment is just as rapid, and further conveys a great deal more of valuable information.
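The root-mean-square line of Table VII can be recomputed from the tabulated coefficients. The Python sketch below does this for four of the columns, taking the product-moment column as the "true r". Because the table entries are rounded to two decimals, agreement with the printed third decimal is only approximate; the dictionary keys and function name are illustrative.

```python
import math

# Product-moment values for the four characters, in the order of Table VII.
PRODUCT_MOMENT = [0.33, 0.17, 0.19, 0.29]

# A selection of the other columns of Table VII, entries as tabulated.
METHODS = {
    "variate difference": [0.37, 0.25, 0.14, 0.26],
    "bracket-rank by rho": [0.41, 0.07, 0.21, 0.20],
    "bracket-rank by R": [0.36, 0.05, 0.29, 0.19],
    "mid-rank by rho": [0.45, 0.08, 0.19, 0.24],
}

def rms_deviation(estimates, reference):
    """Root-mean-square deviation of the estimates from the reference values."""
    n = len(reference)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, reference)) / n)

if __name__ == "__main__":
    for name, values in METHODS.items():
        print(f"{name:22s} {rms_deviation(values, PRODUCT_MOMENT):.3f}")
```

On these figures the variate difference method deviates least from the product-moment values and the R method most, which is the ordering described in the text.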

(12) Conclusions. Three new methods of determining variate correlation have been given in this paper. The first, that of variate differences, seems likely to be of some service in the case of symmetrical tables containing large numbers, the frequency being approximately normal; homotyposis tables may be taken as illustration.

The second, that of deducing variate correlation from correlation of ranks, may be of service when it is not possible to put a quantitative value on the individual character. Thus it might be easy to form a relative series of intensity of pigment, and place individuals in rank. But mere correlation of ranks is not in itself a comparable character, as the variate correlation may have widely different values for the same ranking. Justification for the comparability depends upon assuming a widespread rule of frequency distribution, and this rule can hardly be other than normality. The present paper shows how to deduce variate correlation from correlation of ranks. It shows, however, that such a method of reaching variate correlation is considerably less exact than the usual product-moment method. There is no gain in accuracy, but the reverse, in using such a method in the case of short series.

Thirdly,  the  method  proposed  by  Spearman  of  deducing  the  correlation  of  ranks 
from  the  positive  differences  of  ranks  is  discussed,  and  the  error  of  the  process  by 
which  he  has  deduced  for  it  an  accuracy  greater  than  that  of  the  more  usual  methods 
of  finding  correlation  is  indicated.  A  method  for  deducing  variate  correlation  from 
positive  difference  of  ranks  is  indicated.  The  method  is  very  rapid  for  short  series, 
say  those  not  exceeding  20  observations,  but  it  is  less  accurate  than  the  product- 
moment  method,  and  considerable  changes  in  the  final  value  reached  will  be  found  to 
arise  according  as  we  use  bracket-ranks  or  mid-ranks  in  the  case  of  ties.  The 
comparison  with  true  grades  for  a  few  special  cases,  does  not  enable  us  to  say  which 
is  the  better  method ;  the  deviations  from  normality  sometimes  appear  to  make  one, 
sometimes  the  other,  the  closer  to  the  true  correlation. 

In conclusion, I think, we may say that variate correlation found by ranks may
prove  to  be  a  useful  auxiliary  method  of  dealing  with  correlation,  when  it  is  needful 
to  give  a  rough  answer  to  a  problem  in  a  brief  time,  or  when  the  material  itself  is 
incapable  of  being  accurately  measured.  In  all  such  cases  mean  square  of  rank 
differences  will  be  more  accurate  than  mean  positive  rank  difference.  But  both 
methods  must  be  used  with  caution,  and  their  easy  application  must  not  lead  us  to 
approve  exaggerated  statements  as  to  their  accuracy. 